
Add websites to extract

First, let’s set up your knowledge base. Each knowledge base contains a group of websites you want to extract.

Step 1 - Add a new knowledge base

Click "Knowledge bases" on the Lightfeed left panel.

Go to "Knowledge bases"

Click "+ Knowledge base".

Click "Add knowledge base"

Give your knowledge base a name, then click “Next”.

Name your knowledge base

Step 2 - Add a website to extract

Click the “+ Website” button to add a website you want to extract to this knowledge base.

Add website to extract

Choose the website type: “Gateway website” or “RSS feed”. Choose “Gateway website” if you are extracting a list of items (e.g. articles, posts, discussions, products) from the website.

Choose website type

Step 3 - Input URL and define data schema

Enter the website URL and define the data schema you want to extract into. By default, we extract title (text), link (URL), and description (text, optional).

Input URL and define data schema

You can add new fields to extract (depending on the website, these might be price, upvotes, author, comment link, etc.). For this example, from github.com/trending, I want to extract the number of stars and the programming language of each project, so I added “stars” and “language” by clicking "+ field".

When all fields are added, click "Add".

Extract custom fields
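
For reference, the completed schema for this example can be written out as the sketch below. This is only an illustration of the fields and their types; you define them through the Lightfeed form shown above rather than in code, and the numeric type for "stars" is an assumption for this example.

```typescript
// Illustrative sketch of the schema defined above for github.com/trending.
// "title", "link", and "description" are the defaults; "stars" and
// "language" were added via "+ field". The numeric type for "stars" is an
// assumption, not something the form requires.
type TrendingRepo = {
  title: string;        // text
  link: string;         // URL
  description?: string; // text, optional
  stars: number;        // custom field
  language: string;     // custom field
};
```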

Step 4 - Extract websites and index into knowledge base

Lightfeed will scrape the website live, extract results into the schema you defined, and index the results into your knowledge base. This takes around one minute.

Once a website is added, Lightfeed extracts it every day, automatically deduplicating and indexing new content to keep your knowledge base up to date and consistent.

You can review the results that got indexed into the knowledge base.

Review extracted and indexed results
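
As an illustration of what an indexed result might look like, here is a hypothetical record matching the schema above; the values are placeholders, not actual extraction output, and the Lightfeed UI may present results differently.

```typescript
// Hypothetical indexed record (placeholder values, not real output)
// conforming to the TrendingRepo schema sketched in Step 3.
const exampleResult: TrendingRepo = {
  title: "example-org/example-repo",
  link: "https://github.com/example-org/example-repo",
  description: "A placeholder description for an example trending project.",
  stars: 1234,
  language: "TypeScript",
};
```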

Step 5 - Repeat steps 2-4 to add more websites

Repeat steps 2-4 to extract more websites. Once done, click “Exit” in the top right corner.

Add more websites and exit