Other accounts:
All of my comments are licensed under the following license
Yeah pretty much. There is also a weighting based on the percentage of comments in that community that come from that user.
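A rough sketch of what that kind of weighting could look like, assuming it is simply each user's share of a community's comments. The `user_shares` helper and the sample counts are hypothetical, not the actual code behind the map.

```python
def user_shares(counts_by_user):
    """Turn raw comment counts for one community into per-user shares.

    counts_by_user maps a user name to the number of comments that user
    made in the community (hypothetical input format).
    """
    total = sum(counts_by_user.values())
    # Each share is the fraction of the community's comments from that user.
    return {user: n / total for user, n in counts_by_user.items()}

# Hypothetical counts for a single community
shares = user_shares({"alice": 2, "bob": 1, "carol": 1})
print(shares)
```

With shares instead of raw counts, a very busy community and a small one end up on the same scale.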
Actually it did, so thx for that.
I don’t think it was included because there were no new comments made after August 1.
I had to try scraping the websites multiple times because of stupid bugs I put in the code beforehand, so I might have put more strain on the instances than I meant to. If I did this again it would hopefully be much less taxing on the servers.
As for the cost of scraping, it actually isn’t that hard. I just had it running in the background most of the time.
Yeah I’ve noticed there aren’t many clusters that encode specific ideas (there are a few like the anime, nsfw, or sometimes instance level clusters). Most of it just seems to be a blend. Sorta disappointing.
Yeah that sounds like a good idea so you can see how connected local communities are. Probably makes more sense to use original dimensions so no extra information is lost.
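One possible way to compare communities in the original dimensions is cosine similarity between their user-count rows. This is only a sketch under that assumption: the metric choice, the `cosine_similarity` helper, and the sample rows are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A community with no comments has no direction to compare.
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical rows: one entry per user, value = comments in that community
anime = [2, 0, 1]
manga = [4, 0, 2]
steamdeck = [0, 3, 0]

sim_same_users = cosine_similarity(anime, manga)
sim_no_overlap = cosine_similarity(anime, steamdeck)
```

Communities with the same mix of commenters score close to 1, and communities with no shared commenters score 0.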
Something that I find interesting is how close together the central clusters of beehaw.org, slrpnk.net, and lemmy.blahaj.zone are. If you only highlight those instances, you can see how close their communities tend to be.
Total communities: 2986
Total users: 21934
So the dimensions were reduced from (2986, 21934) to (2986, 2)
Edit: Also yeah, it is using UMAP for the algorithm and it does do something pretty similar to what you described.
I was somehow able to get both a picture and a URL added and it looks much better. Thx.
Either the people in !steamdeck@lemmy.world are pretty horny or it’s an artifact of the dimensionality reduction and means nothing.
Edit: Actually it could also be that it just didn’t collect enough data on that community and the most recent person was also active in nsfw communities. I was only able to get back 14ish days in the data for lemmy.world. They produce way too many comments and I got kicked out early.
Yeah pretty much. I wanted to see communities that had similar people commenting, because I thought that would be a good way to see if similar kinds of discussions were happening in those communities.
For example, most of the red dots to the top right are nsfw communities, and it was able to clump them like that because the people that comment in those communities tend to comment in the other nsfw communities as well.
edit: left -> right
I didn’t measure activity for this map. Each dot represents a community. I only used the communities that were on the top 35 instances (except lemmings.world, which it couldn’t grab any comments for).
Well, I used dimensionality reduction to make it 2D, so the axes are just how the algorithm chose to compress it.
The original data had each data point as a community, with each feature being how often a given user posted in that community.
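A minimal sketch of that layout, assuming the scraped data is a list of (community, user) pairs with one pair per comment. The `build_matrix` helper and the sample data are hypothetical, not the actual code behind the map.

```python
from collections import Counter

def build_matrix(comments):
    """Build a community-by-user count matrix.

    Returns (communities, users, matrix), where matrix[i][j] is the number
    of comments users[j] made in communities[i].
    """
    counts = Counter(comments)  # (community, user) -> number of comments
    communities = sorted({c for c, _ in comments})
    users = sorted({u for _, u in comments})
    matrix = [[counts[(c, u)] for u in users] for c in communities]
    return communities, users, matrix

# Hypothetical sample data: one (community, user) pair per comment
comments = [
    ("steamdeck", "alice"),
    ("steamdeck", "bob"),
    ("anime", "alice"),
    ("anime", "alice"),
    ("anime", "carol"),
]
communities, users, matrix = build_matrix(comments)
for name, row in zip(communities, matrix):
    print(name, row)
```

Each row is one community (one dot on the map) and each column is one user, so the full matrix would have one row per community and one column per user.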
There is actually already a website where people just recreated the Bee Movie by hand, so idk, it might actually work as a legal argument.
A few, but none that were as good at collecting up-to-date episodes.
I know, I was talking about how the map I linked to works, which is based on Reddit.
Good communities, insightful posts, etc.
Long distances actually don’t really mean much. It can’t be guaranteed that they correlate to anything. It is mostly the local groups that are conserved, plus a bit of the global structure.
Anti Commercial-AI license (CC BY-NC-SA 4.0)