Web Scraping on Airbnb: The Best Way to Get Listing Data Fast

WebScrapingAPI
6 min read · Apr 7, 2021

Have you ever found yourself trying to find the perfect place to spend your holiday? Or maybe you just want to see how your listing compares with your neighbours’. Either way, why not use the power of web scraping to do it?

A web scraper is a piece of software that automates the tedious process of collecting useful data from third-party websites. Most online services give developers access to an API so they can read information from the website easily. Unfortunately, Airbnb is not one of them. This is where web scrapers come into play.

  • Why would anyone scrape Airbnb data?
  • Extracting data with a web scraping API
      1. Inspecting the source code
      2. Choosing a web scraper
      3. Setting up the project
      4. Making the request
      5. Getting the data in JSON format
  • The power of web scraping

Why would anyone scrape Airbnb data?

Airbnb is a platform that gives people the opportunity to rent their properties using just an internet connection. It was founded in 2008 by Brian Chesky, Nathan Blecharczyk, and Joe Gebbia and it has seen massive success even during the pandemic.

Anyone can browse the listings simply by visiting Airbnb and searching for a place, but there’s no easy way to get a meaningful dataset that answers questions like:

  • How many listings are in a city?
  • How are they priced?
  • What do they look like?
  • How are they rated?

Of course, you have your own reasons for wanting this information, and we’re sure we can help you get it.

Let’s get started!

Extracting data with a web scraping API

To extract all the necessary data, follow the steps below.

1. Inspecting the source code

Start by looking at the elements you want to scrape from Airbnb’s website. Right-click anywhere on the page and select the “Inspect” option to open the Developer Tools.

Let’s say we want to get the price, image, type, and rating of the places we will scrape.

First, we need to find the element the listing cards have in common in the DOM. It looks like the _gig1e7 class is what we are looking for.
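If you want to double-check the selector before writing any scraper code, you can paste a quick test into the DevTools console while on the search results page. Keep in mind that Airbnb’s generated class names change often, so the class may differ by the time you read this:

document.querySelectorAll('._gig1e7').length // should match the number of listing cards on the page
document.querySelectorAll('._gig1e7')[0] // inspect the first card's inner structure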

2. Choosing a web scraper

To get the best results, we recommend using our service, WebScrapingAPI, as it is the one this tutorial is based on. You can give it a try for free by accessing this link. Create an account and come back to this page when you are done.

After you have logged in, go to the dashboard page. Here, you can find your private API access key, which we’ll use to make the requests, the API playground where you can test out our product, and the documentation.
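Every request in this tutorial ultimately boils down to a GET call to the API endpoint with two query parameters: your access key and the target URL. As a rough sketch (the Airbnb URL below is just an example, and the target URL has to be URL-encoded when passed this way; the got library will do that for us later):

https://api.webscrapingapi.com/v1?api_key=YOUR_API_KEY&url=https%3A%2F%2Fwww.airbnb.com%2Fs%2FBerlin%2Fhomes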

3. Setting up the project

After you have created a folder for the project, run the following commands:

npm init -y

npm install got jsdom

The got module will handle the HTTP requests, and the jsdom package will cover our HTML parsing needs.
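After the install, the dependencies section of package.json should look roughly like this (the exact version numbers will differ; note that got v12 and newer are ESM-only, so the require() calls used in this tutorial assume got v11 or earlier):

"dependencies": {
    "got": "^11.8.2",
    "jsdom": "^16.5.2"
}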

Create a new file called “index.js” and open it up.

4. Making the request

Let’s set the parameters, make the request, and parse the HTML. Write the following lines in the file you just created:

const {JSDOM} = require("jsdom");
const got = require("got");

(async () => {
    const params = {
        api_key: "YOUR_API_KEY",
        url: "https://www.airbnb.com/s/Berlin/homes?tab_id=home_tab&refinement_paths%5B%5D=%2Fhomes&flexible_trip_dates%5B%5D=april&flexible_trip_dates%5B%5D=may&flexible_trip_lengths%5B%5D=weekend_trip&date_picker_type=calendar&source=structured_search_input_header&search_type=filter_change&place_id=ChIJAVkDPzdOqEcRcDteW0YgIQQ&checkin=2021-04-01&checkout=2021-04-08"
    }

    // Send the request through the API and parse the returned HTML
    const response = await got('https://api.webscrapingapi.com/v1', {searchParams: params})
    const {document} = new JSDOM(response.body).window

    // Select every listing card on the results page
    const places = document.querySelectorAll('._gig1e7')
})()

As we stated earlier, all the relevant information can be found under the _gig1e7 element, so we fetch all the elements that are assigned the _gig1e7 class. You can log them to the screen by adding a console.log() call right after the line where we define the places constant.

console.log(places)

5. Getting the data in JSON format

From here on, we will dig deeper to get the specific elements containing the price, image, type, and rating information.

After the previously presented lines of code, copy the following:

const results = []

places.forEach(place => {
    if (place) {
        // Price tag text, e.g. "$47 per night, originally $67"
        const price = place.querySelector('._ls0e43')
        if (price) place.price = price.querySelector('._krjbj').innerHTML

        // Cover image source URL
        const image = place.querySelector('._91slf2a')
        if (image) place.image = image.src

        // Listing type, e.g. "Entire apartment in Mitte"
        const type = place.querySelector('._b14dlit')
        if (type) place.type = type.innerHTML

        // Average rating
        const rating = place.querySelector('._10fy1f8')
        if (rating) place.rating = rating.innerHTML

        results.push(place)
    }
})

console.log(results)

As you can see, for each listing on the first page, we fetch the price tag element, the image source location, the listing type, and the rating. In the end, we get an array of objects, each carrying all of these properties.

Now that we have written all the code necessary to scrape the Airbnb information, the index.js file should look something like this:

const {JSDOM} = require("jsdom");
const got = require("got");

(async () => {
const params = {
api_key: "YOUR_API_KEY",
url: "https://www.airbnb.com/s/Berlin/homes?tab_id=home_tab&refinement_paths%5B%5D=%2Fhomes&flexible_trip_dates%5B%5D=april&flexible_trip_dates%5B%5D=may&flexible_trip_lengths%5B%5D=weekend_trip&date_picker_type=calendar&source=structured_search_input_header&search_type=filter_change&place_id=ChIJAVkDPzdOqEcRcDteW0YgIQQ&checkin=2021-04-01&checkout=2021-04-08"
}

const response = await got('https://api.webscrapingapi.com/v1', {searchParams: params})

const {document} = new JSDOM(response.body).window

const places = document.querySelectorAll('._gig1e7')
const results = []

places.forEach(place => {

if (place) {
const price = place.querySelector('._ls0e43')
if (price) place.price = price.querySelector('._krjbj').innerHTML

const image = place.querySelector('._91slf2a')
if (image) place.image = image.src

const type = place.querySelector('._b14dlit')
if (type) place.type = type.innerHTML

const rating = place.querySelector('._10fy1f8')
if (rating) place.rating = rating.innerHTML

results.push(place)
}

})

console.log(results)

})()

As you can see, scraping Airbnb data using WebScrapingAPI is pretty simple:

  1. Make a request to WebScrapingAPI using the necessary parameters: the API key and the URL we need to scrape data from.
  2. Load the DOM using JSDOM.
  3. Select all the listings by finding the specific class.
  4. For each listing, get the price tag, image, listing type, and rating.
  5. Add every place to a new array called results.
  6. Log the newly created results array to the screen.

The output should look something like this:

[
    HTMLDivElement {
        price: '$47 per night, originally $67',
        image: 'https://a0.muscache.com/im/pictures/miso/Hosting-46812239/original/c56d6bb5-3c2f-4374-ac01-ca84a50d31cc.jpeg?im_w=720',
        type: 'Room in serviced apartment in Friedrichshain',
        rating: '4.73'
    },
    HTMLDivElement {
        price: '$82 per night, originally $109',
        image: 'https://a0.muscache.com/im/pictures/miso/Hosting-45475252/original/f6bd7cc6-f72a-43ef-943e-deba27f8253d.jpeg?im_w=720',
        type: 'Entire serviced apartment in Mitte',
        rating: '4.80'
    },
    HTMLDivElement {
        price: '$97 per night, originally $113',
        image: 'https://a0.muscache.com/im/pictures/92966859/7deb381e_original.jpg?im_w=720',
        type: 'Entire apartment in Mitte',
        rating: '4.92'
    },
    HTMLDivElement {
        price: '$99 per night, originally $131',
        image: 'https://a0.muscache.com/im/pictures/f1b953ca-5e8a-4fcd-a224-231e6a92e643.jpg?im_w=720',
        type: 'Entire apartment in Prenzlauer Berg',
        rating: '4.90'
    },
    HTMLDivElement {
        price: '$56 per night, originally $61',
        image: 'https://a0.muscache.com/im/pictures/bb0813a6-e9fe-4f0a-81a8-161440085317.jpg?im_w=720',
        type: 'Entire apartment in Tiergarten',
        rating: '4.67'
    },
    ...
]
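Keep in mind that results holds DOM element objects with a few extra properties attached to them. If you want clean, serializable JSON that you can save to disk or feed into another tool, you can map the elements down to plain objects first. A minimal sketch, to be placed right after the final console.log (the output file name is just an example):

const fs = require('fs')

// Keep only the fields we extracted, then write them to disk as JSON
const data = results.map(({price, image, type, rating}) => ({price, image, type, rating}))
fs.writeFileSync('airbnb-listings.json', JSON.stringify(data, null, 2))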

One limitation we are currently facing is that we only scrape the information from the first page of the search results. This can be fixed by using a headless browser, like Puppeteer, or a browser automation tool, like Selenium. These tools let us do most of the things we can do manually in a web browser, like completing a form or clicking a button.
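As a rough sketch of the headless browser approach, here is how walking through a few result pages could look with Puppeteer (after running npm install puppeteer). The next-page selector below is an assumption and needs to be verified against the live page, since Airbnb changes its markup frequently:

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://www.airbnb.com/s/Berlin/homes', {waitUntil: 'networkidle2'});

    const pages = [];
    for (let i = 0; i < 3; i++) {
        // Grab the raw HTML of the current results page; it can be parsed with jsdom exactly as before
        pages.push(await page.content());

        // 'a[aria-label="Next"]' is a hypothetical selector for the pagination button
        const next = await page.$('a[aria-label="Next"]');
        if (!next) break;
        await Promise.all([
            page.waitForNavigation({waitUntil: 'networkidle2'}),
            next.click()
        ]);
    }

    await browser.close();
    console.log(`Collected ${pages.length} result pages`);
})();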

We encourage you to check out our Ultimate Guide to Web Scraping with JavaScript and Node.js if you want to find out more about these technologies.

The power of web scraping

As you can see, we managed to build a basic web scraper in just a couple of minutes. From here on, your imagination is the limit. If you’re ambitious enough, you can even use the data you gather to visualize the distribution and concentration of properties on a map. Airbnb provides all the necessary information in the head of any listing page.
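For example, you could fetch an individual listing page through the same API, parse it with jsdom, and log the meta tags in its head to see which ones carry the location data you need. A quick exploratory sketch (the listing URL is a placeholder):

const {JSDOM} = require("jsdom");
const got = require("got");

(async () => {
    const params = {
        api_key: "YOUR_API_KEY",
        url: "https://www.airbnb.com/rooms/LISTING_ID" // placeholder: any listing page
    }

    const response = await got('https://api.webscrapingapi.com/v1', {searchParams: params})
    const {document} = new JSDOM(response.body).window

    // List every meta tag in the head so you can spot the ones holding coordinates
    const metaTags = [...document.head.querySelectorAll('meta')]
        .map(m => ({key: m.getAttribute('property') || m.getAttribute('name'), content: m.content}))
    console.log(metaTags)
})()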

Web scraping can be one of the most fun ways to spend your time as a software developer. You can easily fetch all the data you need to build a new application for a specific niche or just to sharpen your skills. If you run into any issues along the way, don’t hesitate to ask in the comments section, and we will be glad to help!

If this article hasn’t fully shown you what web scraping is capable of, take a look at another example of how businesses can build a web scraper step by step.

Thank you for your time! Happy scraping!


Written by WebScrapingAPI

Tips, guides, product stories, and anything in between. Discover the web scraping world with us! https://webscrapingapi.com
