Archive tweets from the command line

twarc

twarc is a command line tool and Python library for collecting and archiving Twitter JSON data via the Twitter API. It has separate commands (twarc and twarc2) for working with the older v1.1 API and the newer v2 API and Academic Access (respectively). It also has an ecosystem of plugins for doing things with the collected data.

See the twarc documentation for details on running the commands: twarc2 for the v2 API and twarc1 for the v1.1 API. If you aren't sure which one to use, start with twarc2, since the v1.1 API is scheduled to be retired.

Install

If you have Python installed, you can install twarc from a terminal (such as the Windows Command Prompt, available in the "Start" menu, or the macOS Terminal application):

pip3 install twarc

Once installed, you should be able to use the twarc and twarc2 command line utilities, or use twarc as a Python library; check the project's examples for that.
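To confirm that the install worked, a quick check along these lines may help. This is a minimal sketch using only the Python standard library; the helper name `twarc_install_report` is made up for illustration and is not part of twarc. It assumes that pip registered the package under the name `twarc` and placed the `twarc` and `twarc2` commands on your PATH.

```python
import shutil
from importlib import metadata


def twarc_install_report():
    """Describe whether the twarc package and its commands can be found.

    Hypothetical helper for illustration; it is not part of twarc itself.
    """
    try:
        # Version of the installed "twarc" distribution, if any.
        version = metadata.version("twarc")
    except metadata.PackageNotFoundError:
        version = None
    return {
        "version": version,
        # Full path to each command line utility, or None if not on PATH.
        "twarc": shutil.which("twarc"),
        "twarc2": shutil.which("twarc2"),
    }


if __name__ == "__main__":
    for name, value in twarc_install_report().items():
        print(f"{name}: {value or 'not found'}")
```

If the version is reported but a command is not found, the directory where pip installs scripts is probably missing from your PATH.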

Other Tools

Twarc is purpose-built for working with the Twitter API to archive and study digital trace data. It is not built as a general-purpose API library for Twitter. While its primary use is academic, it works just as well with the "Standard" v2 API and the "Premium" v1.1 APIs.

For a list of general-purpose Twitter libraries in different languages, see the Twitter documentation. For Python, TwitterAPI and tweepy are both up to date and maintained. They also support the v2 APIs, but their data format with expansions may differ from twarc's. Twitter also provides a reference implementation of the v2 Academic Access Search and v1.1 Premium Search. The v2 version of this script is compatible with twarc.

For R there is academictwitteR. Unlike twarc, it focuses solely on querying the Twitter Academic Research Product Track v2 API endpoint. Data gathered in twarc can be imported into R for analysis as a dataframe if you export the data into CSV using twarc-csv.
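The same CSV export can also be read back in Python without extra dependencies. The sketch below is illustrative only: it uses the standard `csv` module on a small in-memory sample, and the column names `id` and `text` are placeholders assumed for the example, so check the header row of your own export before relying on them.

```python
import csv
import io


def load_rows(csv_file):
    """Read every row of an open CSV file into a list of dicts keyed by column name."""
    return list(csv.DictReader(csv_file))


# In-memory stand-in for an exported file; the columns are assumed for illustration.
sample = io.StringIO('id,text\n1,"hello, world"\n2,second tweet\n')
rows = load_rows(sample)

for row in rows:
    print(row["id"], row["text"])
```

For a real export, replace the in-memory sample with a file opened as `open(path, newline="", encoding="utf-8")`, where `path` is the CSV file you created.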

Getting Help

Check the tutorials to get started, or follow along with this recorded stream introducing twarc. If you run into trouble, feel free to make a post on the Twarc Repository or on the Twitter Developer Forums.

Download files

Source Distribution

twarc-2.10.4.tar.gz (58.0 kB view details)

Uploaded Source

Built Distribution

twarc-2.10.4-py3-none-any.whl (59.8 kB view details)

Uploaded Python 3

File details

Details for the file twarc-2.10.4.tar.gz.

File metadata

  • Download URL: twarc-2.10.4.tar.gz
  • Size: 58.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.12

File hashes

Hashes for twarc-2.10.4.tar.gz
Algorithm Hash digest
SHA256 f26d52f4aee133ab8fee516dca4a97ddc0c42fa5240db4be53cd52dd48b6a67b
MD5 1fd33014697d0ca8a4128d0545788f7e
BLAKE2b-256 46a271696490c3d7318554a56200f3af3dd307996476dffd5239a001889e9ba7

See more details on using hashes here.

File details

Details for the file twarc-2.10.4-py3-none-any.whl.

File metadata

  • Download URL: twarc-2.10.4-py3-none-any.whl
  • Size: 59.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.12

File hashes

Hashes for twarc-2.10.4-py3-none-any.whl
Algorithm Hash digest
SHA256 4619dd23e659ad5647f715ec0fb8257e282c308ca6d36bb496becf41e751299b
MD5 72b226430f3962cd163e2651142be0e5
BLAKE2b-256 a26bb9ee51746fd4187c222a9e240ac4baf2b317ff00788b8353c69b0dc6ce35
