Database
Lightfeed's database provides a structured way to store, query, and analyze your extracted web data. This central repository makes it easy to work with data from multiple sources while maintaining consistency and tracking changes over time.
Key Characteristics
- Structured storage: Organizes extracted web data in a consistent table format
- Deduplication: Updates existing records instead of creating duplicates
- Automatic updates: Refreshes based on your defined schedule
- Change tracking: Preserves data history and highlights changes between extractions
- Custom views: Supports filters and search queries for data analysis
How It Works
At the scheduled time, Lightfeed crawls the specified sources, extracts data according to your prompt, and structures it according to your schema. The structured data is then written to the database table using deduplication logic: when a record with the same ID field value already exists, Lightfeed updates that record instead of creating a duplicate.
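The deduplication step can be pictured as an upsert keyed on the ID field. The following is a minimal in-memory sketch, not Lightfeed's implementation or API: the `upsert` function, the `id_field` parameter, and the sample records are all hypothetical names chosen for illustration.

```python
from typing import Any

Record = dict[str, Any]


def upsert(table: dict[str, Record], extracted: list[Record],
           id_field: str = "id") -> dict[str, int]:
    """Insert new records and update existing ones, keyed by id_field.

    Hypothetical helper: the table is a plain dict mapping ID -> record.
    """
    counts = {"inserted": 0, "updated": 0, "unchanged": 0}
    for record in extracted:
        key = record[id_field]
        existing = table.get(key)
        if existing is None:
            # First time this ID is seen: create a new row.
            table[key] = dict(record)
            counts["inserted"] += 1
            continue
        # Same ID seen again: merge into the existing row, never duplicate it.
        merged = {**existing, **record}
        if merged != existing:
            table[key] = merged
            counts["updated"] += 1
        else:
            counts["unchanged"] += 1
    return counts


table: dict[str, Record] = {}
first_run = upsert(table, [{"id": "p1", "price": 10},
                           {"id": "p2", "price": 20}])
second_run = upsert(table, [{"id": "p1", "price": 12},   # changed price
                            {"id": "p2", "price": 20},   # identical
                            {"id": "p3", "price": 5}])   # new record
```

After the second run the table still holds one row per ID: `p1` carries the new price, `p2` is untouched, and `p3` is added.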
Setup
When setting up a Lightfeed database, you'll need to configure these three essential components:
- Prompt: The instruction that guides what data to extract from your sources
- Schema: The defined structure and format for organizing your extracted data
- Schedule: The frequency of automated data updates
The setup applies to all sources added to that database. This ensures consistency across your data collection while giving you flexibility to modify extraction parameters as your needs evolve.
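To make the relationship between the three components concrete, here is a small sketch that models a database setup as one object shared by every source. It is an illustration under assumptions, not Lightfeed's configuration format: the `DatabaseSetup` class, its field names, the representation of a schema as a name-to-type mapping, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Any


@dataclass
class DatabaseSetup:
    """Hypothetical model of one database's setup."""
    prompt: str                   # what to extract from each source
    schema: dict[str, type]       # field name -> expected Python type
    update_every: timedelta       # schedule, expressed as an interval
    sources: list[str] = field(default_factory=list)

    def conforms(self, record: dict[str, Any]) -> bool:
        """True if the record has exactly the schema's fields and types."""
        if set(record) != set(self.schema):
            return False
        return all(isinstance(record[name], kind)
                   for name, kind in self.schema.items())

    def next_run(self, last_run: datetime) -> datetime:
        """When the next scheduled extraction would start."""
        return last_run + self.update_every


setup = DatabaseSetup(
    prompt="Extract each product's name and price",
    schema={"id": str, "name": str, "price": float},
    update_every=timedelta(days=1),
    sources=["https://example.com/products"],
)

# Adding a source does not change the prompt, schema, or schedule:
# every source in the database is extracted with the same setup.
setup.sources.append("https://example.com/clearance")

ok = setup.conforms({"id": "p1", "name": "Lamp", "price": 19.99})
missing_field = setup.conforms({"id": "p1", "name": "Lamp"})
following = setup.next_run(datetime(2024, 1, 1, 9, 0))
```

Keeping the prompt, schema, and schedule on the database rather than on each source is what lets records from different sources land in one consistent table.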
Database View
The database view is your central dashboard for managing extracted data. It shows:
- All data extracted from your defined sources
- When each record was last updated
- Changes to existing records and their change history
It also supports custom filters and search queries for data analysis.
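As a rough illustration of what the view surfaces, the sketch below computes a field-level change between two versions of a record and applies a text search plus a filter to a list of rows. It operates on plain Python dictionaries; the `diff` and `search` functions and the sample rows are hypothetical and do not represent Lightfeed's query interface.

```python
from datetime import datetime
from typing import Any, Callable, Optional

Record = dict[str, Any]


def diff(old: Record, new: Record) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for every field that differs."""
    return {name: (old.get(name), new.get(name))
            for name in sorted(old.keys() | new.keys())
            if old.get(name) != new.get(name)}


def search(rows: list[Record], text: str = "",
           where: Optional[Callable[[Record], bool]] = None) -> list[Record]:
    """Keep rows containing `text` in any value (case-insensitive)
    that also satisfy the optional `where` predicate."""
    needle = text.lower()
    return [row for row in rows
            if needle in " ".join(str(v) for v in row.values()).lower()
            and (where is None or where(row))]


rows = [
    {"id": "p1", "name": "Desk Lamp", "price": 12.0,
     "last_updated": datetime(2024, 1, 2)},
    {"id": "p2", "name": "Floor Lamp", "price": 20.0,
     "last_updated": datetime(2024, 1, 1)},
    {"id": "p3", "name": "Chair", "price": 5.0,
     "last_updated": datetime(2024, 1, 2)},
]

# Change between two extractions of the same record.
price_change = diff({"id": "p1", "price": 10.0}, {"id": "p1", "price": 12.0})

# Search query, search plus filter, and a "recently updated" filter.
lamps = search(rows, text="lamp")
cheap_lamps = search(rows, text="lamp", where=lambda r: r["price"] < 15)
recent = search(rows, where=lambda r: r["last_updated"] >= datetime(2024, 1, 2))
```

Here `price_change` records only the field that differs, which is the kind of information a change history needs to highlight what moved between extractions.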