Simulating network requests to test a new feature in @distributed
You know how when you follow an account on Mastodon you don't get to see any of the user's older posts unless someone else on your instance follows them? Well if it's a Distributed Press site it'll attempt to "backfill" your instance with all the older posts once your follow request is accepted. 😎
@mauve @distributed that might have unintended side-effects
@thisismissem Probably! What sort were you thinking of?
@mauve like flooding people's timelines with the backfilling of posts
@thisismissem I assumed timelines would sort by published time no? I'm gonna run some tests before deploying this to production ofc
@mauve I think there's a bug to the contrary
@thisismissem In mastodon? i'd love to see the source code for it if you know what part of it it'd be in.
@mauve I don't have it off the top of my head, but follow from /app/lib/activitypub/activity/create.rb
@thisismissem It seems to be taking the created_at part out at least. 🤔 https://github.com/mastodon/mastodon/blob/main/app/lib/activitypub/activity/create.rb#L122C7-L122C17
I'll look more after testing. Gonna let my personal instance take the bulk of the damage :P
@mauve yeah, I've heard of like, birdsite bridge that did similar backfilling clogging up people's timelines, but that might be specific to that implementation.
@thisismissem @mauve so... imo the ActivityPub way of doing this is to publish your last X posts at an "outbox" endpoint specified in your profile. Then a remote server following you can parse the outbox and get some content to fill things with. I did this with some software I was building and Mastodon just... didn't grab any of the outbox posts and I was confused as to why not
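(Aside: a minimal sketch of the outbox approach described above, in Python. It walks an actor's `outbox` OrderedCollection page by page and yields the most recent activities. The names `iter_outbox`, `fetch_json`, and `limit` are made up for illustration; `fetch_json` stands in for whatever HTTP GET a real server would do with an ActivityPub `Accept` header.)

```python
from typing import Callable, Iterator, Optional, Union


def iter_outbox(
    outbox_url: str,
    fetch_json: Callable[[str], dict],  # hypothetical: URL in, parsed JSON out
    limit: int = 20,
) -> Iterator[dict]:
    """Yield up to `limit` activities from an outbox, in the order the server lists them."""
    collection = fetch_json(outbox_url)

    # Small outboxes may inline `orderedItems`; larger ones point at a
    # `first` page, given either as a URL or as an embedded page object.
    page: Optional[Union[str, dict]]
    if "orderedItems" in collection:
        page = collection
    else:
        page = collection.get("first")

    seen = 0
    while page is not None and seen < limit:
        if isinstance(page, str):
            page = fetch_json(page)
        for activity in page.get("orderedItems", []):
            if seen >= limit:
                return
            yield activity
            seen += 1
        # `next` is absent on the last page, which ends the loop.
        page = page.get("next")
```

Passing the fetcher in keeps the paging logic separate from networking, so it can be exercised against canned JSON before pointing it at a live server.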
@darius @thisismissem @mauve yeah the outbox is there to support "web browser" use cases, fetch the latest activities and page through to see history
the thing with mastodon and timeline sorting is that the timeline sorting algorithm isn't using creation date, it's using arrival date. this is so someone can't create a post set in the far future to have it pinned at the top of your timeline.
@darius @thisismissem @mauve so delivering an activity ~now would insert the post at the top of the TL even if it was backdated in the `published` property.
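(Aside: a toy illustration of the ordering described above, not Mastodon's actual code. Each post carries the `published` date it claims plus a `received` time; the `received` field name and the `home_timeline` function are assumptions made up for this sketch.)

```python
from datetime import datetime, timezone

UTC = timezone.utc


def home_timeline(posts):
    """Newest first by arrival time; the claimed `published` date is ignored."""
    return sorted(posts, key=lambda p: p["received"], reverse=True)


posts = [
    # An ordinary post, delivered seconds after it was written.
    {"id": "fresh",
     "published": datetime(2024, 5, 1, 12, 0, 0, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 12, 0, 5, tzinfo=UTC)},
    # An old post delivered just now by a backfill.
    {"id": "backfilled",
     "published": datetime(2021, 1, 1, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 12, 1, 0, tzinfo=UTC)},
    # A post claiming to come from the far future.
    {"id": "future-dated",
     "published": datetime(2099, 1, 1, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 11, 59, 0, tzinfo=UTC)},
]

ordered_ids = [p["id"] for p in home_timeline(posts)]
```

Here the future-dated post cannot pin itself to the top, but the backfilled post from 2021 lands above everything because it arrived last.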
@trwnh @thisismissem @mauve Oh interesting. But also: why not use arrival date for future dated posts and creation date for back dated posts? (Genuine question btw. I'm sure I'm missing some weird edge case race condition)
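(Aside: the idea in this question can be sketched as taking the earlier of the two dates. This is only an illustration of the proposal, not something Mastodon is known to do; `effective_time`, `home_timeline_clamped`, and the `received` field are hypothetical names.)

```python
from datetime import datetime, timezone

UTC = timezone.utc


def effective_time(published, received):
    """Future-dated posts fall back to arrival time; back-dated posts keep their claimed date."""
    return min(published, received)


def home_timeline_clamped(posts):
    """Newest first by the clamped timestamp."""
    return sorted(
        posts,
        key=lambda p: effective_time(p["published"], p["received"]),
        reverse=True,
    )


posts = [
    {"id": "fresh",
     "published": datetime(2024, 5, 1, 12, 0, 0, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 12, 0, 5, tzinfo=UTC)},
    {"id": "backfilled",
     "published": datetime(2021, 1, 1, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 12, 1, 0, tzinfo=UTC)},
    {"id": "future-dated",
     "published": datetime(2099, 1, 1, tzinfo=UTC),
     "received": datetime(2024, 5, 1, 11, 59, 0, tzinfo=UTC)},
]

clamped_ids = [p["id"] for p in home_timeline_clamped(posts)]
```

With this key the backfilled post sinks to its 2021 position and the future-dated post is treated as having been written when it arrived. It still relies on the sender's clock and timezone labels being right for anything back-dated.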
@darius @trwnh @thisismissem @mauve this is always a bit messy as posts can get skewed if servers don't correctly label timezones, but that is a good practice, yes.
@darius @trwnh @thisismissem @mauve Because of Announce, as the obvious case. But more generally, when an object was created doesn't have any real connection to when it should be presented to people. Mastodon is doing more or less the right thing, here.
I wish mastodon would be more prompt in backfilling from the outbox, so that this wouldn't even seem necessary. But even that is basically the right behavior.
@jenniferplusplus @trwnh @thisismissem @mauve I see what you mean by time created vs when it should be presented. That said, if we consider the context of pulling from an outbox to backfill, then the context seems pretty clearly (to me, I could be wrong) "do not show this in the home timeline, this is just for filling in our database for views more generally"
DOES Mastodon pull from the outbox? I didn't observe it doing it at all
@darius @jenniferplusplus @thisismissem @mauve nope, mastodon p much ignores the outbox
and yeah it makes sense to *not* do timeline insertion when backfilling, backfilling seems like the kind of thing that is only/primarily useful for viewing a profile and not keeping up with current activities
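(Aside: one way to picture "store it, but don't put it in the timeline". This is a sketch with made-up names, `ingest`, `posts_by_id`, and `home_timeline`, not how any particular server does it.)

```python
def ingest(post, posts_by_id, home_timeline, *, backfill=False):
    """Always store the post; only live deliveries are pushed onto the home timeline."""
    posts_by_id[post["id"]] = post  # profile views and search read from here
    if not backfill:
        home_timeline.insert(0, post["id"])  # newest arrival goes on top


posts_by_id = {}
home_timeline = []

# A post delivered to the inbox right now.
ingest({"id": "live-1", "content": "hello"}, posts_by_id, home_timeline)
# An old post pulled in while backfilling a newly followed account.
ingest({"id": "old-1", "content": "from 2021"}, posts_by_id, home_timeline, backfill=True)
```

After both calls the old post is available for profile views but never shows up in the home timeline.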
@darius @jenniferplusplus @thisismissem @mauve another way to look at it is that there are really two dates, one is the date the object claims to be published, and the other is the date that the activity arrived in your inbox. most fedi dev is used to thinking in terms of objects, but it would be more technically correct and spec-accurate to think in terms of activities
@trwnh @jenniferplusplus @thisismissem @mauve one of my goals for my next year of funded work is to make it easier for devs to think in AP terms rather than name-your-API terms
@darius @jenniferplusplus @thisismissem @mauve hey if there's room for me lemme know! that's a laudable goal that i'd be happy to cooperate on
@trwnh @darius @thisismissem @mauve objects have dereferenceable URIs. Why wouldn't I make that a first class entity in my data model?
@jenniferplusplus @darius @thisismissem @mauve activities do too, as well! or they should. activities are also objects.
the advantage of thinking in terms of activities is that it's a better representation of reality, with AP serving as a specialization of LDN (Linked Data Notifications), you're basically notifying your followers/audience/recipients that "something happened". it's the reason we POST Create activities and not just raw non-activity objects.
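(Aside: to make the "we POST Create activities, not raw objects" point concrete, here is a small sketch that wraps a Note in a Create activity. The helper name `wrap_in_create` and the `/activity` suffix used for the activity id are assumptions for illustration; the example.com URLs are placeholders.)

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def wrap_in_create(note: dict, actor: str) -> dict:
    """Build the 'something happened' notification: actor created this object."""
    return {
        "@context": AS_CONTEXT,
        # Hypothetical id scheme; the activity should have its own dereferenceable URI.
        "id": note["id"] + "/activity",
        "type": "Create",
        "actor": actor,
        "published": note.get("published"),
        # Addressing is copied from the object so both agree on the audience.
        "to": note.get("to", []),
        "cc": note.get("cc", []),
        "object": note,
    }


note = {
    "id": "https://example.com/posts/1",
    "type": "Note",
    "attributedTo": "https://example.com/actor",
    "published": "2024-05-01T12:00:00Z",
    "to": ["https://www.w3.org/ns/activitystreams#Public"],
    "content": "Hello fediverse",
}

activity = wrap_in_create(note, "https://example.com/actor")
body = json.dumps(activity)  # this JSON is what would be POSTed to a follower's inbox
```

The Note keeps its own id and can still be stored as a first-class entity; the activity is the envelope that says who did what to it.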
@trwnh @jenniferplusplus @thisismissem @mauve and also for indexing search!
@darius @trwnh @thisismissem @mauve My understanding is that mastodon does pull from the outbox, but it may not happen immediately. There's background tasks that fetch remote objects into the local cache on some schedule, and I haven't looked into the implementation details.
@jenniferplusplus @darius @thisismissem @mauve p sure mastodon will pull from the following
- `featured` collection (pinned posts) when a profile is discovered for the first time
- `replies` collection when a post is encountered for the first time (but only the first page of that collection, which in mastodon always contains self-replies, with all other replies being forced into page 2 at least)
@trwnh @jenniferplusplus @darius @thisismissem @mauve this is good to know! when we were testing with users they were confused they couldn't find any previous posts on their instance and thought the integration wasn't working, even though mastodon's ui has a really small message (!) saying it isn't pulling this information. that's why we're backfilling but clearly missed out on this bit of activitypub folklore :B
@jenniferplusplus @darius @trwnh @thisismissem Are you sure? I have yet to see this happen in any impls. We did it in reader.distributed.press specifically because nobody else seemed to be doing it which is a major PITA for me. :P
@mauve @darius @trwnh @thisismissem I'm pretty sure mastodon sends a request to the outbox endpoint. I don't know what they do with that, in part because I don't have a useful outbox yet.
@jenniferplusplus @mauve @darius @thisismissem at most i think they might pull the totalItems to get a post count for your profile?
@jenniferplusplus @darius @thisismissem @mauve yeah you can simulate the same effect with spinning up a *oma instance and following some people then taking it offline for a bit. when you come back online your home tl will be out of order because post deliveries get retried at different intervals lol
@thisismissem @darius @mauve this is a good way to have data loss from time zone related issues
i think mastodon has a window of 12 hours for HTTP signatures for this reason
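(Aside: a sketch of that kind of check, comparing the `Date` header of a signed request against the local clock. The 12 hour default just echoes the figure guessed at above and is not verified against Mastodon's source; `date_within_window` is a made-up name.)

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def date_within_window(date_header, now=None, window=timedelta(hours=12)):
    """Return True if the HTTP Date header is within `window` of `now`, in either direction."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # Headers without usable zone information are treated as UTC here.
        sent = sent.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return abs(now - sent) <= window
```

A generous window tolerates skewed clocks and mislabeled timezones on the sending server, at the cost of accepting older replays of a signed request.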
For the curious: https://github.com/hyphacoop/social.distributed.press/pull/87