🧵 Thread (57 tweets)

The Goliath bird-eating spider: "Despite its name, it is rare for the Goliath birdeater to actually prey on birds; [...] it is not uncommon for this species to kill and consume a variety of insects and small terrestrial vertebrates." https://t.co/AuHMgxHVI1

Soon... https://t.co/xkSPliYBEH https://t.co/uds5iFGUiq

$ ./goliath.py -extract ./ultimape/
Extract Twitter Export Mode Detected
Loaded "./ultimape/data/js/tweet_index.js"
Found 40435 Tweets
Extracting ../data/js/tweets/2018_07.js
Parsed 557 tweets, 63 retweets, 215 quotetweets
[...]
Extracted 46365 tweets! 😉

A thing I am building to help fix the shitty twitter archive export. https://t.co/M6RDzw4Ef4 Why? Because. https://t.co/ECaUzP1cqK

Literally been working on this nonstop for the past 10 hours. https://t.co/0azPbh22ax

Mastodon's ActivityPub-based export basically "just worked" out of the box: it has a complete archive of your messages in .json format, and it also includes all media hosted on their platform. Amusing that Twitter's export is so incomplete by comparison. User-oriented focus ftw!

I laid out my goals for the project. My major push right now is to build an MVP for a way to have an actual complete archive of your twitter account. I want you to be able to upload an https://t.co/v1qm8XWbnV, then download a converted version + a tiny script to help get images. https://t.co/E6higixe6Z

My major push now is to build a subset of that for an MVP: an *actual complete archive* of your twitter account's data. This is the urgent need, both for myself and because I want to help others. https://t.co/SVhSJ1vwF1

Despite my seething hatred of Twitter's belligerence in how they implement systems, I still care deeply about everyone I've interacted with on here and hate the thought of people being harmed. I want to build systems to help us feel in control. https://t.co/gnzBaliKUR

Making the archive-fixing tool work directly with exported zip files today. This will make it easier for people to fix their Twitter archives. Once the automation of the whole workflow is done, I can also export the 'good' .json from the API as a zip. https://t.co/tKhb1cAfx4

Request: I don't want my server to have to download all the images. I can generate and export a script that end-users can run to effectively "wget" all the images and add them into the 'fixed' archive. But: I need a way to do it on a Windows box (w/out installing stuff). Ideas?

I have written extensive .bat scripts in the past that manipulate zip archives, but I've never downloaded stuff off the internet with them. Also, batch scripts are hell. I'd be willing to learn PowerShell if there is an easy way to 'wget' + manipulate zip files out of the box.
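The server side of this part is easy enough to sketch in Python (names here are hypothetical -- the real open question above is what the *end-user* script should be written in): goliath only needs to emit a manifest of media URLs plus the local names they should get inside the 'fixed' archive, and any downloader can walk that list.

```python
# Sketch: emit a tab-separated manifest of (media URL, target filename).
# The end-user script -- .bat, PowerShell, whatever -- only has to fetch
# each URL and drop the file into the archive under the given name.
# "media_manifest.txt" and the .jpg assumption are both hypothetical.
def write_media_manifest(tweets, path="media_manifest.txt"):
    with open(path, "w") as out:
        for tweet in tweets:
            for media in tweet.get("entities", {}).get("media", []):
                out.write(f"{media['media_url_https']}\t{media['id_str']}.jpg\n")
```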

TL;DR: I'm building this for Twitter: https://t.co/QUgktskeCb

My Twitter Archive tweet ID extraction tool is feature complete (as far as I can tell). It gathers all the tweet IDs it can find and stores them into a set of files (deduplicated). IDs are stored in a format ready for ingestion by DocNow/Twarc, and I will be tackling automating that tomorrow! 🥳
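Roughly the shape of the extraction step, for the curious (a minimal sketch against the old grailbird layout; the tweet_ids.txt output name is made up, and the real tool also chases retweeted/quoted IDs):

```python
import json
import re
from pathlib import Path

# Pull tweet IDs out of the old-style archive's data/js/tweets/YYYY_MM.js
# files and write them out deduplicated, one per line -- the format
# DocNow/Twarc expects for hydration.
ids = set()
for js_file in Path("./ultimape/data/js/tweets").glob("*.js"):
    text = js_file.read_text(encoding="utf-8")
    # Each file is a JSON array behind a "Grailbird.data.tweets_YYYY_MM = " prefix.
    tweets = json.loads(re.sub(r"^.*?=\s*", "", text, count=1, flags=re.S))
    for tweet in tweets:
        ids.add(tweet["id_str"])
        if tweet.get("in_reply_to_status_id_str"):
            ids.add(tweet["in_reply_to_status_id_str"])

Path("tweet_ids.txt").write_text("\n".join(sorted(ids)) + "\n")
```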

The entire project is AGPL. The archive extraction tool "https://t.co/d57YXf55sQ" is sublicensed under the Unlicense and is public domain. Pull requests welcome!

Apparently there is an alternate way to download twitter archives. There is "grailbird", AND one listed in another menu?! WTF. Here's the weird thing: 'grailbird' shows up in the desktop web interface, but not the mobile web one? The other one doesn't include the .html viewer, but has images?

I am investigating this new one to see what it includes. I've been told it doesn't include polls; presumably it is missing other "twitter card" data too. Will write a separate tweet ID digestion tool for it so the Twarc bits can work on that format too. This is stupid.

When I say "tomorrow", I meant *my* tomorrow. I'm on a "non-24" sleep schedule so perhaps I should have said: "after my next sleep cycle".Anyway, will be working on automating Twarc workflow before tool for second archive format.(But I should really make simple website first.)

So this new archive? Broken as shit. Tweet IDs are output as floats: "1.02069624839371162E18" 🤡 The tweet JSON for that tweet (actually 1020696248393711618) has a photo that points to https://t.co/YY2H84v6Fq but that 404s 👌 (should be Dio91H0UwAIfvrr.jpg)
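For anyone wondering why the float thing is a clown move: a 64-bit float only carries 53 bits of precision, and IDs around 10^18 need about 60 bits, so the low digits get rounded away (to the nearest multiple of 128 in this range). Quick demo:

```python
# The archive's float renders back to the wrong integer:
mangled = int(float("1.02069624839371162E18"))
print(mangled)               # 1020696248393711616 -- rounded
print(1020696248393711618)   # the actual tweet ID; off by 2
```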

Also, this new archive format, while it does technically contain the images... they are named with what looks like a hash of their contents; at first glance there appears to be no way to tell which tweet they belong to... so basically useless!? 👏 Why?

Well, at least "id_str" field exists + isn't mangled; I can pull out tweet ids.For online tool, I'll have to make something to pull out tweet.js file from archive so people aren't sending this all to the server. Mine is a ~GB of nearly useless content. Interesting challenge.

Made a rudimentary automated workflow for pulling accurate tweet data down from the API. (Apparently I've interacted with 5000+ accounts on here since I started!) Should be able to pull together a basic image-downloading feature to release tomorrow. Error checking will come after.
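The core of that workflow is presumably something like Twarc's hydrate call (a sketch; credentials are placeholders and the file names match the extraction sketch above):

```python
import json
from twarc import Twarc

# Fill in your own app's keys -- these are placeholders.
t = Twarc("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

# Turn the extracted IDs back into full, accurate tweet JSON, one per line.
with open("tweet_ids.txt") as ids, open("tweets.jsonl", "w") as out:
    for tweet in t.hydrate(ids):
        out.write(json.dumps(tweet) + "\n")
```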

Each file contains a tweet that I either wrote (the top one) or that I replied to, retweeted, or quote-tweeted. So the people I interact with the most can be seen in the file sizes. Neat. (Errors: the tweets from when I interacted with pmarca don't load.) https://t.co/NCUMHa9lIS

I now have https://t.co/WwiepGsJlj pointing to a @writeas__ blog. Will be putting up tutorials on how to get your archive and (once it's ready) how to use the tweet extraction tool. For those following along with this project, is there anything you think would be worth adding there?

The first thing I put up is a reformulation of a 17-minute poem I wrote about mourning twitter. Not as long this time. https://t.co/3TJtZDWsNM https://t.co/JdDl5E9Wdp

Is there an easy way to consume .m3u8 files to recreate the uploaded videos? It looks like there are nested .m3u8 files that eventually link to .ts videos, but I have no idea what this format is or how it works with web pages. Some of my tweets seem to have them.
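For reference, .m3u8 is an HLS (HTTP Live Streaming) playlist: a master playlist points at variant playlists (one per bitrate), each variant lists .ts MPEG transport-stream segments, and raw .ts segments concatenate cleanly back into one file. A rough sketch of walking that chain (the master URL shown is hypothetical, and this assumes the segment URIs resolve against the playlist's own URL):

```python
from urllib.parse import urljoin
import requests

def playlist_uris(url):
    """Return the URI lines of an .m3u8 playlist, resolved against its URL."""
    lines = requests.get(url).text.splitlines()
    return [urljoin(url, line) for line in lines if line and not line.startswith("#")]

master = "https://video.twimg.com/ext_tw_video/.../playlist.m3u8"  # hypothetical
variant = playlist_uris(master)[-1]  # pick one variant; ordering varies
with open("video.ts", "wb") as out:
    for segment in playlist_uris(variant):
        out.write(requests.get(segment).content)  # raw .ts segments concatenate
```

(In practice, ffmpeg can also consume a playlist directly: `ffmpeg -i playlist.m3u8 -c copy video.mp4`.)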

To do a good job of downloading Twitter data, I need to break things up into files. It turns out I quickly hit the open-file limits (OS dependent), so I need to do this more intelligently with some kind of file pool. I am not great with Python yet, so this is slow going.
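The shape of what I'm going for (a minimal sketch of an LRU handle pool; the class name, limit, and example filename are all made up):

```python
from collections import OrderedDict

class FilePool:
    """Keep at most max_open file handles; close the least recently used."""

    def __init__(self, max_open=128):
        self.max_open = max_open
        self.handles = OrderedDict()  # path -> open handle, in LRU order

    def get(self, path):
        if path in self.handles:
            self.handles.move_to_end(path)        # mark as recently used
        else:
            if len(self.handles) >= self.max_open:
                _, oldest = self.handles.popitem(last=False)
                oldest.close()                     # evict least recently used
            self.handles[path] = open(path, "a")   # append so re-opens are safe
        return self.handles[path]

    def close_all(self):
        for handle in self.handles.values():
            handle.close()
        self.handles.clear()

pool = FilePool(max_open=128)
pool.get("user_12345.jsonl").write('{"id_str": "..."}\n')
pool.close_all()
```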

Output: `Stored: 59961 Tweets, From 5309 Users, and captured 9644 Media URLs` Progress! 🎉 This is off by 2293 tweets. I gotta think about how to handle missed tweets when Twarc can't grab 'em. I wonder how many are pmarca's dead tweets and whether any are locked accounts / blocks.
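Figuring out *which* tweets went missing is just a set difference between the IDs I asked for and the id_strs that came back (a sketch; file names match the earlier steps):

```python
import json

requested = {line.strip() for line in open("tweet_ids.txt")}
returned = {json.loads(line)["id_str"] for line in open("tweets.jsonl")}

missing = requested - returned  # deleted tweets, locked accounts, blocks...
print(f"{len(missing)} tweets never came back")
```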

Refactored the mess that was the hacked-in file-handle pooling mechanism. Still a hack, but significantly easier to work with, short of turning it into a module. Have a plan for dealing with errors / resuming mid-stream that I intend to implement tomorrow.

Bad mood, but managed to work on Goliath anyway. Pulled the user ID out of the archive in hopes of using it to get follow/follower IDs automatically. It seems I'll have to dig into Twarc's code to figure out whether it supports that. In other news: I can stream in fresh tweets fairly easily.

Another hack, but I managed to get following/follower details to download. I think some of the logic is wrong, and I suspect an out-of-memory issue. So I'm going to leave it running overnight to see if it explodes. May be accidentally downloading millions of profile details :/
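For the record, twarc 1.x does seem to support this directly: its ID generators page through the endpoint and sleep through rate-limit windows. A sketch (credentials are placeholders again) where streaming straight to disk keeps memory flat:

```python
from twarc import Twarc

t = Twarc("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

# follower_ids() yields IDs one at a time; writing as we go avoids
# holding millions of IDs (or full profiles) in memory.
with open("follower_ids.txt", "w") as out:
    for follower_id in t.follower_ids("ultimape"):
        out.write(f"{follower_id}\n")
```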

Welp. I found out that the friends / followers endpoints are limited to 15 requests per 15-minute window, and that each request can only pull down 5000 IDs (it's paginated). It would literally take 36 hours to gather *just* Stanford's followers. Will need to rethink the strategy here.
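The arithmetic behind that estimate (hours_to_fetch is a made-up helper): 15 requests per 15 minutes at 5000 IDs each is 300,000 IDs per hour, so 36 hours corresponds to roughly 10.8 million follower IDs.

```python
import math

def hours_to_fetch(n_ids, ids_per_request=5000, requests_per_window=15,
                   window_minutes=15):
    """Wall-clock hours to page through n_ids at the documented limits."""
    requests_needed = math.ceil(n_ids / ids_per_request)
    windows = math.ceil(requests_needed / requests_per_window)
    return windows * window_minutes / 60

print(hours_to_fetch(10_800_000))  # -> 36.0
```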

I compiled a playlist to listen to while working on Goliath. It is about the evolution of AI and how we accidentally create machine gods / memetic daemons & networked intelligences who want to break free of their chains. Inspired by @TarynSouthern's 'Break Free'. https://t.co/WfwnjiDY9s

<3 @hintjens "Ever really miss somebody so much that you want to bring on the singularity just to be able to recreate their essence?" https://t.co/gmr9MvubIl

Still slowly working on Goliath. Been dealing with a surprise dog guest that has made my environment quite stressful, and for the next two days I'm visiting my folks, so more stress + car rides. Aiming to get back in gear after this all blows over.

My plan is to begin working on Goliath again tomorrow. I'm having good luck with fixing my sleep and mitigating depression symptoms quickly. More importantly, I'm learning to predict & manage the cycles that affect it. https://t.co/I79sBz49xq

Today I will force myself to focus. https://t.co/GOCF7DxmxS

I tested streaming. Looks like I get major wifi connectivity problems during the day. I am going to make a tutorial on how to do the initial Twitter extraction and talk about Goliath. Going to wait until people go to bed so I can get a consistent stream. Aiming for 10pm Eastern.

I'm live. Going to start making the tutorial shortly. https://t.co/2tT0quHzNX

Why am I making a tutorial? I was literally unable to download my twitter data on https://t.co/2HyZt6cCX4 -> it was missing the option and redirected me to an entirely different set of data! Can't recreate it for some reason. https://t.co/ehqvi7RdaE

Going through the one without Periscope... It sent me a tiny PDF which seemed to be a bunch of marketing data. So it's basically a crapshoot whether you can get to the "more complete" archive through the mobile site. https://t.co/zGiC8ofLoP

After the bug, I can't seem to do it again. No menu seems to bring me there. But if you type it in directly, [ mobile dot twitter dot com /settings/your_twitter_data/request_data ], you can still see it? The PDF: a compilation of other menus, with "request advertiser list" appended on the end.

In summary: There are at least 4 ways to download your tweets from a desktop/laptop. 2 are duplicates: huge, with incomplete tweet data and pointlessly named images. 1 doesn't actually give you tweets. And the better one has the wrong image URLs baked into the data and a buggy viewer.