Instagram Scraping

Getting Instagram profile details using Python

Instagram is a photo and video-sharing social networking service owned by Facebook. In this article, we will learn how can we get Instagram profile details using web scraping. Python provides powerful tools for web scraping, we will be using BeautifulSoup here.

Modules required and Installation:

Requests :
Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs.

pip install requests

Beautiful Soup:
Beautiful Soup is a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.

pip install beautifulsoup4

null

Explanation –

For a given user name data scraping will be done then parsing of data will be done so that output can be readable. The output will be description i.e followers count, following count, count of posts.

Below is the implementation –# importing libraries frombs4 importBeautifulSoup importrequests # instagram URL URL ="https://www.instagram.com/{}/"# parse function defparse_data(s): # creating a dictionary data ={} # splittting the content  # then taking the first part s =s.split("-")[0] # again splitting the content  s =s.split(" ") # assigning the values data['Followers'] =s[0] data['Following'] =s[2] data['Posts'] =s[4] # returning the dictionary returndata # scrape function defscrape_data(username): # getting the request from url r =requests.get(URL.format(username)) # converting the text s =BeautifulSoup(r.text, "html.parser") # finding meta info meta =s.find("meta", property="og:description") # calling parse method returnparse_data(meta.attrs['content']) # main function if__name__=="__main__": # user name username ="geeks_for_geeks"# calling scrape function data =scrape_data(username) # printing the info print(data) Output :

{'Followers': '120.2k', 'Following': '0', 'Posts': '702'}

Leave a Comment

Your email address will not be published. Required fields are marked *