Simulating network requests to test a new feature in @distributed
You know how when you follow an account on Mastodon you don't get to see any of the user's older posts unless someone else on your instance follows them? Well, if it's a Distributed Press site it'll attempt to "backfill" your instance with all the older posts once your follow request is accepted. 😎
@mauve @distributed that might have unintended side-effects
@thisismissem Probably! What sort were you thinking of?
@mauve like flooding people's timelines with the backfilling of posts
@thisismissem I assumed timelines would sort by published time no? I'm gonna run some tests before deploying this to production ofc
@mauve I think there's a bug to the contrary
@thisismissem In Mastodon? I'd love to see the source code for it if you know what part of it it'd be in.
@mauve I don't have it off the top of my head but follow from /app/lib/activitypub/activity/create.rb
@thisismissem It seems to be taking the created_at part out at least. 🤔 https://github.com/mastodon/mastodon/blob/main/app/lib/activitypub/activity/create.rb#L122C7-L122C17
I'll look more after testing. Gonna let my personal instance take the bulk of the damage :P
@mauve yeah, I've heard of like, birdsite bridge that did similar backfilling clogging up people's timelines, but that might be specific to that implementation.
@thisismissem @mauve so... imo the ActivityPub way of doing this is to publish your last X posts at an "outbox" endpoint specified in your profile. Then a remote server following you can parse the outbox and get some content to fill things with. I did this with some software I was building and Mastodon just... didn't grab any of the outbox posts and I was confused as to why not
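For anyone following along, here is a minimal sketch of what "parse the outbox" could look like on the consuming side. It assumes the outbox is an ActivityStreams `OrderedCollection` whose pages carry `orderedItems` and a `next` link. The `iter_outbox` name, the injected `fetch` callable, and the example.com URLs are all made up for illustration; this is not what Mastodon or any other implementation actually does.

```python
from typing import Callable, Iterator


def iter_outbox(outbox_url: str, fetch: Callable[[str], dict], limit: int = 20) -> Iterator[dict]:
    """Yield up to `limit` activities from an outbox, following `next` links."""
    collection = fetch(outbox_url)
    # A small outbox may inline `orderedItems`; otherwise start from `first`.
    page = collection if "orderedItems" in collection else collection.get("first")
    visited = set()
    yielded = 0
    while page is not None and yielded < limit:
        if isinstance(page, str):  # a link to a page rather than an embedded page
            if page in visited:  # guard against a page chain that loops
                return
            visited.add(page)
            page = fetch(page)
        for activity in page.get("orderedItems", []):
            if yielded >= limit:
                return
            yield activity
            yielded += 1
        page = page.get("next")


# Canned documents standing in for HTTP responses (all URLs are made up).
FAKE_DOCS = {
    "https://example.com/outbox": {
        "type": "OrderedCollection",
        "totalItems": 3,
        "first": "https://example.com/outbox?page=1",
    },
    "https://example.com/outbox?page=1": {
        "type": "OrderedCollectionPage",
        "orderedItems": [{"id": "a3"}, {"id": "a2"}],
        "next": "https://example.com/outbox?page=2",
    },
    "https://example.com/outbox?page=2": {
        "type": "OrderedCollectionPage",
        "orderedItems": [{"id": "a1"}],
    },
}

recent = list(iter_outbox("https://example.com/outbox", FAKE_DOCS.__getitem__, limit=2))
```

Passing `fetch` in keeps the paging logic separate from HTTP details; a real fetcher would request the URL with an ActivityPub `Accept` header and decode the JSON.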
@darius @thisismissem @mauve yeah the outbox is there to support "web browser" use cases, fetch the latest activities and page through to see history
the thing with mastodon and timeline sorting is that the timeline sorting algorithm isn't using creation date, it's using arrival date. this is so someone can't create a post set in the far future to have it pinned at the top of your timeline.
@darius @thisismissem @mauve so delivering an activity ~now would insert the post at the top of the TL even if it was backdated in the `published` property.
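A toy model of the two orderings being discussed, in case it helps. The `Post` class and the `received_at` field are hypothetical names for illustration and are not taken from Mastodon's code; the only point is to show how a backdated delivery and a future-dated post land under each sort key.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Post:
    text: str
    published: datetime    # timestamp claimed by the author's server
    received_at: datetime  # when the local server actually got the post


def by_arrival(posts):
    """Newest-first by when the post reached us."""
    return sorted(posts, key=lambda p: p.received_at, reverse=True)


def by_published(posts):
    """Newest-first by the author-supplied `published` value."""
    return sorted(posts, key=lambda p: p.published, reverse=True)


UTC = timezone.utc
posts = [
    Post("ordinary post", datetime(2024, 5, 1, 12, 0, tzinfo=UTC),
         datetime(2024, 5, 1, 12, 0, 5, tzinfo=UTC)),
    # Written years ago, but delivered just now as part of a backfill.
    Post("backfilled post", datetime(2021, 1, 1, tzinfo=UTC),
         datetime(2024, 5, 1, 12, 1, tzinfo=UTC)),
    # Claims to be from the far future, but arrived earlier than the others.
    Post("future-dated post", datetime(2099, 1, 1, tzinfo=UTC),
         datetime(2024, 5, 1, 11, 0, tzinfo=UTC)),
]

arrival_order = [p.text for p in by_arrival(posts)]
published_order = [p.text for p in by_published(posts)]
```

Sorting by arrival puts the backfilled post first and the future-dated post last; sorting by `published` does the reverse, which is the pinning problem described above.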
@trwnh @thisismissem @mauve Oh interesting. But also: why not use arrival date for future dated posts and creation date for back dated posts? (Genuine question btw. I'm sure I'm missing some weird edge case race condition)
@darius @trwnh @thisismissem @mauve Because of Announce, as the obvious case. But more generally, when an object was created doesn't have any real connection to when it should be presented to people. Mastodon is doing more or less the right thing, here.
I wish mastodon would be more prompt in backfilling from the outbox, so that this wouldn't even seem necessary. But even that is basically the right behavior.
@jenniferplusplus @trwnh @thisismissem @mauve I see what you mean by time created vs when it should be presented. That said, if we consider the context of pulling from an outbox to backfill, then the context seems pretty clearly (to me, I could be wrong) "do not show this in the home timeline, this is just for filling in our database for views more generally"
DOES Mastodon pull from the outbox? I didn't observe it doing it at all
@darius @trwnh @thisismissem @mauve My understanding is that mastodon does pull from the outbox, but it may not happen immediately. There's background tasks that fetch remote objects into the local cache on some schedule, and I haven't looked into the implementation details.
@jenniferplusplus @darius @trwnh @thisismissem Are you sure? I have yet to see this happen in any impls. We did it in reader.distributed.press specifically because nobody else seemed to be doing it which is a major PITA for me. :P
@mauve @darius @trwnh @thisismissem I'm pretty sure mastodon sends a request to the outbox endpoint. I don't know what they do with that, in part because I don't have a useful outbox yet.
@jenniferplusplus @mauve @darius @thisismissem at most i think they might pull the totalItems to get a post count for your profile?
@darius @trwnh @jenniferplusplus @thisismissem I think I'd rather "spam" the timeline with a few posts in a row than have users miss those posts forever. Maybe we can increase the interval between posts for backfill to limit "spam", but not having them at all is awful UX for anyone trying to read stuff on the fediverse.
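A rough sketch of the "increase the interval between posts" idea, under stated assumptions: `schedule_backfill` is a hypothetical helper, not the Distributed Press implementation, and each post is assumed to carry a `published` string in one consistent ISO 8601 UTC format so that plain string sorting matches chronological order.

```python
from datetime import datetime, timedelta, timezone


def schedule_backfill(posts, start, interval):
    """Pair each post with a send time, oldest first, `interval` apart."""
    # Assumes `published` values share one ISO 8601 UTC format, so they
    # sort chronologically as plain strings.
    oldest_first = sorted(posts, key=lambda p: p["published"])
    return [(start + i * interval, post) for i, post in enumerate(oldest_first)]


old_posts = [
    {"id": "post-3", "published": "2023-03-01T00:00:00Z"},
    {"id": "post-1", "published": "2023-01-01T00:00:00Z"},
    {"id": "post-2", "published": "2023-02-01T00:00:00Z"},
]

start = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
plan = schedule_backfill(old_posts, start, timedelta(minutes=10))
```

Sending oldest first means that a receiver ordering by arrival still ends up with the posts in their original relative order, just spread over a longer window.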
@darius @mauve @trwnh @jenniferplusplus @thisismissem Heh I never understood the "no polling in ActivityPub" part -- the Outbox is *right* there :)
@dmitri @darius @mauve @jenniferplusplus @thisismissem i think evan would like to remind people that AP supports both push and pull/poll, yeah :p
@trwnh @dmitri @darius @mauve @jenniferplusplus there was actually a whole big controversy a few months ago where people didn't understand that activitypub wasn't just push based, and that pulling data was core to the protocol too.
Led to accusations of scraping, when that wasn't the case at all.
@mauve @trwnh @jenniferplusplus @thisismissem idk, I think we need to distinguish between pushed and pulled data. We say there is no polling in ActivityPub but stuff like the outbox does provide a place to poll and I think pushed data should be handled fundamentally differently from pulled data.
If you're fetching a profile for the first time it seems obvious to me it should not drop straight into anyone's live feeds. But also: that's a fetch and not someone publishing something to a local inbox
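One way the pushed/pulled distinction could be sketched on the receiving side. The `Server` class, its fields, and the `pushed` flag are hypothetical and only illustrate the proposal in this post; they do not describe how any existing server behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    stored: dict = field(default_factory=dict)         # every object we know about
    home_timeline: list = field(default_factory=list)  # object ids, newest first

    def ingest(self, activity: dict, pushed: bool) -> None:
        """Store the object; only inbox deliveries reach the live feed."""
        obj = activity["object"]
        self.stored[obj["id"]] = obj
        if pushed and obj["id"] not in self.home_timeline:
            self.home_timeline.insert(0, obj["id"])


server = Server()

# Pulled: found while paging through a remote outbox after a follow.
server.ingest({"type": "Create", "object": {"id": "old-1"}}, pushed=False)
server.ingest({"type": "Create", "object": {"id": "old-2"}}, pushed=False)

# Pushed: delivered to the local inbox by the remote server.
server.ingest({"type": "Create", "object": {"id": "new-1"}}, pushed=True)
```

Under this split the backfilled posts would be available on profile and thread views without ever appearing in a follower's home timeline.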