
Extract websites to your knowledge base

Andrew Zhong
Co-founder of Lightfeed

Welcome! Lyla from Lightfeed here. We are super excited to release the knowledge base and a redesigned dashboard, which help you gather and maintain web data easily from hundreds to thousands of websites.

Introducing knowledge base

A knowledge base is your custom database for website data extracted at scale. We've spent the last two months improving it so that it can extract data from any public website at the frequency you choose.

Just define the data format you need from each website. Lightfeed will then extract, deduplicate and index the data into your knowledge base continuously. Extraction can be configured to run every hour or every day.
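
To make "define the data format" and "deduplicate" concrete, here is a minimal Python sketch. It is not Lightfeed's API: the `Listing` fields, the `dedupe_key` and `merge_into_index` helpers, and the sample records are all hypothetical, and this post does not describe how Lightfeed actually deduplicates.

```python
from dataclasses import dataclass
import hashlib
import json


@dataclass(frozen=True)
class Listing:
    """Hypothetical data format for one extracted record."""
    source_url: str
    title: str
    price_usd: int


def dedupe_key(record: Listing) -> str:
    """Hash the fields assumed to identify a record (normalized title + price)."""
    payload = json.dumps(
        {"title": record.title.strip().lower(), "price_usd": record.price_usd},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def merge_into_index(index: dict[str, Listing], batch: list[Listing]) -> int:
    """Add records not seen before to the index; return how many were new."""
    added = 0
    for record in batch:
        key = dedupe_key(record)
        if key not in index:
            index[key] = record
            added += 1
    return added


index: dict[str, Listing] = {}
hourly_run = [
    Listing("https://example.com/a", "2-bed apartment", 2400),
    Listing("https://example.com/b", "2-Bed Apartment ", 2400),  # same listing on another page
    Listing("https://example.com/c", "Studio", 1500),
]
new_count = merge_into_index(index, hourly_run)
print(new_count, len(index))
```

Running the same batch again adds nothing new, which is the property a recurring hourly or daily extraction needs.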


Lightfeed works for any public website. Unlike traditional web scraping, which depends on brittle selectors, Lightfeed uses AI (large language models) to understand and reason about the entire page in order to extract and search. It can extract relevant content from anywhere on the page and is robust to site design changes.
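
The brittleness of selectors can be seen in a small Python sketch. It only illustrates the failure mode of selector-based scraping; it does not show Lightfeed's LLM-based extraction, which this post does not specify. The `ClassTextFinder` class, the `select_text` helper, and both HTML snippets are made up for illustration.

```python
from html.parser import HTMLParser


class ClassTextFinder(HTMLParser):
    """Collect the text of the first element that carries a given CSS class."""

    def __init__(self, target_class: str) -> None:
        super().__init__()
        self.target_class = target_class
        self._capturing = False
        self.result = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.result is None and self.target_class in classes:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            self.result = data.strip()
            self._capturing = False


def select_text(html: str, css_class: str):
    """Return the text selected by class name, or None if nothing matches."""
    finder = ClassTextFinder(css_class)
    finder.feed(html)
    return finder.result


before = '<div class="price">$2,400</div>'
after = '<div class="listing-cost">$2,400</div>'  # same content, renamed class

print(select_text(before, "price"))
print(select_text(after, "price"))
```

The second call finds nothing even though the price is still on the page, because the selector was tied to a class name rather than to the meaning of the content.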

Index website

Better data visibility in the dashboard

At Lightfeed, we want to make sure you can easily view your website data and workflow results all in one place. So we redesigned the Dashboard and made it accessible from every data menu in Lightfeed.

You can now filter by source or by time. For example, you can narrow results to the last month, the last year, or all time.

Dashboard filter

Table view is here. You can now apply filters on individual columns and export results to CSV.
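
Once results are exported, the same source and time filters can be reproduced offline. The Python sketch below assumes a CSV with `source`, `title` and `extracted_at` columns; those column names and the sample rows are hypothetical, since the post does not document the export layout.

```python
import csv
import io
from datetime import datetime, timedelta

# Stand-in for an exported file; the column names are assumptions.
exported_csv = """source,title,extracted_at
zillow.com,2-bed apartment,2024-05-20T09:00:00
redfin.com,Studio,2024-03-02T14:30:00
zillow.com,3-bed house,2023-11-11T08:15:00
"""


def filter_rows(rows, source=None, since=None):
    """Keep rows matching `source` (if given) and extracted at or after `since`."""
    kept = []
    for row in rows:
        if source is not None and row["source"] != source:
            continue
        if since is not None and datetime.fromisoformat(row["extracted_at"]) < since:
            continue
        kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(exported_csv)))
now = datetime(2024, 6, 1)  # fixed "now" so the example is reproducible

last_year_zillow = filter_rows(rows, source="zillow.com", since=now - timedelta(days=365))
last_month = filter_rows(rows, since=now - timedelta(days=30))

print([r["title"] for r in last_year_zillow])
print([r["title"] for r in last_month])
```

With a real export, replace the `io.StringIO` stand-in with an opened file and adjust the column names to match.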

Dashboard table

Other improvements

  1. We released a new API-based pricing plan. There is no longer a hard limit on the number of websites you can index.
  2. We launched an onboarding UI that guides new users through creating their first knowledge base and workflow.
  3. We now support infinite scrolling during scraping, which gives more comprehensive results on sites such as Zillow and Redfin.
  4. We expanded our extraction context to 128K tokens (~100K words) per web page, providing up to four times more results at the same cost.
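
As a rough check on the token-to-word figure in item 4, the conversion can be sketched in Python. The 0.75 words-per-token ratio is a common rule of thumb for English text, not a number from Lightfeed; the real ratio depends on the tokenizer and the page content.

```python
# Rule of thumb for English text: about 0.75 words per token (an approximation).
WORDS_PER_TOKEN = 0.75

context_tokens = 128_000
approx_words = int(context_tokens * WORDS_PER_TOKEN)
print(approx_words)  # 96000
```

That is about 96,000 words, in line with the ~100K words quoted above.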