Twitter / X Tasks

Twitter / X-related Parsers can be imported from image_crawler_utils.stations.twitter. Check out its documentation for more details.

Get Twitter / X Cookies

Please refer to the documentation of get_twitter_cookies() for detailed information.

You can save Twitter / X cookies for later use instead of calling this function every run.

Twitter / X Tips

The main station URL of Twitter / X is: https://x.com/

Note

Parsers of Twitter / X will not use the page_num parameter, as no gallery pages exist on Twitter / X.

image_num will be used by Parsers of Twitter / X, but the actual number of images may exceed this parameter.

Twitter / X Parsers only check whether the number of images has exceeded image_num after a Twitter / X search result page, or a batch of search result pages, has been fully scanned.
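
The effect of checking image_num only after a full page or batch can be illustrated with a small standalone sketch. It does not use image_crawler_utils; the function collect_until_limit and the page contents are hypothetical and only mimic the behavior described in the note above.

```python
from typing import Iterable, List


def collect_until_limit(batches: Iterable[List[str]], image_num: int) -> List[str]:
    """Collect items batch by batch, comparing against the limit only between batches."""
    collected: List[str] = []
    for batch in batches:
        # The whole batch is kept; nothing is trimmed in the middle of a batch.
        collected.extend(batch)
        # The limit is compared only after the batch has been fully scanned.
        if len(collected) >= image_num:
            break
    return collected


# Hypothetical search result pages containing 4, 3 and 5 images.
pages = [
    [f"page1_img{i}" for i in range(4)],
    [f"page2_img{i}" for i in range(3)],
    [f"page3_img{i}" for i in range(5)],
]
result = collect_until_limit(pages, image_num=5)
print(len(result))  # 7: the second page pushed the total past image_num=5
```

If an exact count matters, the returned list can be truncated afterwards.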

Important

If you want to download restricted content, please configure your account settings before using Parsers.

About TwitterKeywordMediaParser

For the meaning of parameters and attributes, check out the documentation of TwitterKeywordMediaParser.

To set advanced search options, check out the documentation of TwitterSearchSettings. An instance of it can be passed to the twitter_search_settings parameter.

About TwitterUserMediaParser

For the meaning of parameters and attributes, check out the documentation of TwitterUserMediaParser.

Example Program of TwitterKeywordMediaParser

This example downloads images from tweets that contain the keywords (hashtags) “#クオン”, “#イラスト” and “#うたわれるもの”, were sent before 2024/01/01, and include media, into the Twitter folder. After you log in to Twitter / X, the cookies will be saved to Twitter_cookies.json for later use.

from image_crawler_utils import Downloader
from image_crawler_utils.stations.twitter import TwitterKeywordMediaParser, TwitterSearchSettings, get_twitter_cookies

cookies = get_twitter_cookies(
    proxies={"https": "socks5://127.0.0.1:7890"}  # If you do not use system proxies, set the proxies manually.
)
cookies.save_to_json("Twitter_cookies.json")  # Save it to a JSON file for later use

parser = TwitterKeywordMediaParser(
    standard_keyword_string='#クオン AND #イラスト AND #うたわれるもの',
    twitter_search_settings=TwitterSearchSettings(
        only_media=True,
        ending_date='2024-01-01',
    ),
    cookies=cookies,
)
image_info_list = parser.run()
downloader = Downloader(
    store_path='Twitter',
    image_info_list=image_info_list,
)
downloader.run()
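
The keyword string in the example above joins hashtags with AND. When the hashtags come from a list, a small helper can build that string. This is only a sketch: join_hashtags is a hypothetical helper, not part of image_crawler_utils, and it assumes that AND-joined hashtags as shown above are all that is needed.

```python
from typing import Iterable


def join_hashtags(tags: Iterable[str]) -> str:
    """Join hashtags with ' AND ', adding a leading '#' where it is missing."""
    normalized = [tag if tag.startswith("#") else f"#{tag}" for tag in tags]
    return " AND ".join(normalized)


keyword_string = join_hashtags(["クオン", "#イラスト", "うたわれるもの"])
print(keyword_string)  # #クオン AND #イラスト AND #うたわれるもの
```

The result can then be passed as standard_keyword_string.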

Example Program of TwitterUserMediaParser

This example downloads media images of user “idonum” from 2023/1/1 to 2024/1/1 into the Twitter_user_idonum folder. After you log in to Twitter / X, the cookies will be saved to Twitter_cookies.json for later use.

from image_crawler_utils import Downloader
from image_crawler_utils.stations.twitter import TwitterUserMediaParser, get_twitter_cookies

cookies = get_twitter_cookies(
    proxies={"https": "socks5://127.0.0.1:7890"}  # If you do not use system proxies, set the proxies manually.
)
cookies.save_to_json("Twitter_cookies.json")  # Save it to a JSON file for later use

parser = TwitterUserMediaParser(
    user_id="idonum",
    cookies=cookies,
    starting_date='2023-1-1',
    ending_date='2024-1-1',
)
image_info_list = parser.run()
downloader = Downloader(
    store_path='Twitter_user_idonum',
    image_info_list=image_info_list,
)
downloader.run()

Example ImageInfo of Twitter / X

This example image information is from the Twitter / X status (tweet) 1622634960543973376 (https://x.com/InarikoNkoNa/status/1622634960543973376). Its ImageInfo structure in JSON looks like this:

{
    "url": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
    "name": "1622634960543973376.jpg",
    "info": {
        "status_url": "https://x.com/InarikoNkoNa/status/1622634960543973376",
        "status_id": "1622634960543973376",
        "user_id": "InarikoNkoNa",
        "user_name": "こんこんいなり",
        "time": "2023-02-06T16:34:56.000Z",
        "reply_num": 0,
        "retweet_num": 45,
        "like_num": 135,
        "view_num": 3430,
        "text": "#うたわれるもの \n\nたくさん食べるクオンがかわいい\nhttps://pixiv.net/artworks/105154125…",
        "hashtags": [
            "#うたわれるもの"
        ],
        "links": [
            "https://t.co/cqC48ELZxW"
        ],
        "media_list": [
            {
                "link": "https://x.com/InarikoNkoNa/status/1622634960543973376/photo/1",
                "image_source": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
                "image_id": "FoTAPazaUAIXFk4",
                "image_name": "1622634960543973376.jpg"
            }
        ]
    },
    "backup_urls": []
}
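
As a rough illustration of working with this structure, the sketch below reads a trimmed copy of the record above using only the Python standard library. It operates on the JSON text itself and makes no assumption about how image_crawler_utils loads or saves ImageInfo objects; the variable names are illustrative.

```python
import json
from urllib.parse import parse_qs, urlparse

# A trimmed copy of the example record above (only the fields used here).
record = json.loads("""
{
    "url": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
    "name": "1622634960543973376.jpg",
    "info": {
        "status_id": "1622634960543973376",
        "user_id": "InarikoNkoNa",
        "like_num": 135,
        "hashtags": ["#うたわれるもの"],
        "media_list": [
            {
                "image_id": "FoTAPazaUAIXFk4",
                "image_name": "1622634960543973376.jpg"
            }
        ]
    },
    "backup_urls": []
}
""")

# The image URL carries the file format and size variant as query parameters.
query = parse_qs(urlparse(record["url"]).query)
image_format = query["format"][0]
size_name = query["name"][0]

# Each entry of media_list describes one image attached to the status.
image_ids = [media["image_id"] for media in record["info"]["media_list"]]

print(image_format, size_name, image_ids)  # jpg orig ['FoTAPazaUAIXFk4']
```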