Twitter / X Tasks
Twitter / X-related Parsers can be imported from image_crawler_utils.stations.twitter. Check out its documentation for more details.
Twitter / X Tips
The main station URL of Twitter / X is: https://x.com/
Note
Parsers of Twitter / X will not use the page_num parameter, as no gallery pages exist in Twitter / X.
image_num is used by Twitter / X Parsers, but the actual number of images may exceed this parameter.
Twitter / X Parsers only check whether the number of images has exceeded image_num after a Twitter / X search result page, or a batch of such pages, has been fully scanned.
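The overshoot described above can be illustrated with a minimal sketch. This is not the library's implementation: the function collect_images and the page data are hypothetical, and the sketch only assumes that the limit is compared against the running total after each page has been scanned in full.

```python
from typing import Iterable, List


def collect_images(pages: Iterable[List[str]], image_num: int) -> List[str]:
    """Hypothetical helper: gather image names page by page.

    The limit is checked only after a whole page has been scanned,
    so the result may contain more than image_num items.
    """
    collected: List[str] = []
    for page in pages:
        collected.extend(page)  # every image on the page is kept
        if len(collected) >= image_num:  # checked between pages only
            break
    return collected


# Five made-up result pages with four images each.
pages = [[f"img_{p}_{i}" for i in range(4)] for p in range(5)]
result = collect_images(pages, image_num=10)
print(len(result))  # 12: the third page pushed the total past 10
```

With four images per page and image_num=10, scanning stops after the third page, so 12 images are returned rather than 10.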
Important
If you want to download restricted content, configure your account settings before using the Parsers.
About TwitterKeywordMediaParser
For the meaning of parameters and attributes, check out the documentation of TwitterKeywordMediaParser.
To set advanced search options, check out the documentation of TwitterSearchSettings. A TwitterSearchSettings instance can be passed to the twitter_search_settings parameter.
About TwitterUserMediaParser
For the meaning of parameters and attributes, check out the documentation of TwitterUserMediaParser.
Example Program of TwitterKeywordMediaParser
This example downloads images from tweets that match the keywords (hashtags) “#クオン”, “#イラスト” and “#うたわれるもの”, were sent before 2024/01/01, and contain media. The images are stored in the Twitter folder. After you log in to Twitter / X, the cookies are saved to Twitter_cookies.json for later use.
from image_crawler_utils import Downloader
from image_crawler_utils.stations.twitter import TwitterKeywordMediaParser, TwitterSearchSettings, get_twitter_cookies

# Log in to Twitter / X and fetch the cookies.
cookies = get_twitter_cookies(
    proxies={"https": "socks5://127.0.0.1:7890"}  # If you do not use system proxies, set the proxies manually.
)
cookies.save_to_json("Twitter_cookies.json")  # Save it to a JSON file for later use

# Search for tweets that contain all three hashtags.
parser = TwitterKeywordMediaParser(
    standard_keyword_string='#クオン AND #イラスト AND #うたわれるもの',
    twitter_search_settings=TwitterSearchSettings(
        only_media=True,  # Only include tweets with media
        ending_date='2024-01-01',  # Only include tweets sent before this date
    ),
    cookies=cookies,
)
image_info_list = parser.run()

# Download the parsed images into the Twitter folder.
downloader = Downloader(
    store_path='Twitter',
    image_info_list=image_info_list,
)
downloader.run()
Example Program of TwitterUserMediaParser
This example downloads media images of the user “idonum” from 2023/1/1 to 2024/1/1 into the Twitter_user_idonum folder. After you log in to Twitter / X, the cookies are saved to Twitter_cookies.json for later use.
from image_crawler_utils import Downloader
from image_crawler_utils.stations.twitter import TwitterUserMediaParser, get_twitter_cookies

# Log in to Twitter / X and fetch the cookies.
cookies = get_twitter_cookies(
    proxies={"https": "socks5://127.0.0.1:7890"}  # If you do not use system proxies, set the proxies manually.
)
cookies.save_to_json("Twitter_cookies.json")  # Save it to a JSON file for later use

# Collect media images of the user within the date range.
parser = TwitterUserMediaParser(
    user_id="idonum",
    cookies=cookies,
    starting_date='2023-1-1',
    ending_date='2024-1-1',
)
image_info_list = parser.run()

# Download the parsed images into the Twitter_user_idonum folder.
downloader = Downloader(
    store_path='Twitter_user_idonum',
    image_info_list=image_info_list,
)
downloader.run()
Example ImageInfo of Twitter / X
This example image information is from Twitter / X status (tweet) 1622634960543973376. Its ImageInfo structure in JSON looks like this:
{
    "url": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
    "name": "1622634960543973376.jpg",
    "info": {
        "status_url": "https://x.com/InarikoNkoNa/status/1622634960543973376",
        "status_id": "1622634960543973376",
        "user_id": "InarikoNkoNa",
        "user_name": "こんこんいなり",
        "time": "2023-02-06T16:34:56.000Z",
        "reply_num": 0,
        "retweet_num": 45,
        "like_num": 135,
        "view_num": 3430,
        "text": "#うたわれるもの \n\nたくさん食べるクオンがかわいい\nhttps://pixiv.net/artworks/105154125…",
        "hashtags": [
            "#うたわれるもの"
        ],
        "links": [
            "https://t.co/cqC48ELZxW"
        ],
        "media_list": [
            {
                "link": "https://x.com/InarikoNkoNa/status/1622634960543973376/photo/1",
                "image_source": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
                "image_id": "FoTAPazaUAIXFk4",
                "image_name": "1622634960543973376.jpg"
            }
        ]
    },
    "backup_urls": []
}
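As a rough illustration of how a record with this shape could be read once it has been saved as JSON, the sketch below parses a shortened copy of the example with Python's standard json module. It does not use the library's ImageInfo class; the helper summarize is hypothetical, and the field names are taken only from the example above.

```python
import json

# Shortened copy of the example record; only fields shown above are used.
record_json = """
{
    "url": "https://pbs.twimg.com/media/FoTAPazaUAIXFk4?format=jpg&name=orig",
    "name": "1622634960543973376.jpg",
    "info": {
        "status_id": "1622634960543973376",
        "user_id": "InarikoNkoNa",
        "like_num": 135,
        "hashtags": ["#うたわれるもの"],
        "media_list": [
            {"image_id": "FoTAPazaUAIXFk4",
             "image_name": "1622634960543973376.jpg"}
        ]
    },
    "backup_urls": []
}
"""


def summarize(record: dict) -> dict:
    """Hypothetical helper: pull a few commonly needed fields from a record."""
    info = record["info"]
    return {
        "file_name": record["name"],
        "author": info["user_id"],
        "likes": info["like_num"],
        "hashtags": info["hashtags"],
        "image_ids": [media["image_id"] for media in info["media_list"]],
    }


record = json.loads(record_json)
summary = summarize(record)
print(summary["author"], summary["likes"], summary["image_ids"])
```

Note that status_id is stored as a string rather than a number, so it can be used directly in file names or URLs without any conversion.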