Enhance Crawl4AI with CLI and documentation updates - Implemented Command-Line Interface (CLI) in `crawl4ai/cli.py` - Added chunking strategies and their documentation in `llm.txt`
58 lines
2.9 KiB
Markdown
58 lines
2.9 KiB
Markdown
### Hypothetical Questions
|
||
|
||
1. **General Understanding of the New Caching System**
|
||
- *"Why did Crawl4AI move from boolean cache flags to a `CacheMode` enum?"*
|
||
- *"What are the benefits of using a single `CacheMode` enum over multiple booleans?"*
|
||
|
||
2. **CacheMode Usage**
|
||
- *"What `CacheMode` should I use if I want normal caching (both read and write)?"*
|
||
- *"How do I enable a mode that only reads from cache, or only writes to cache?"*
|
||
- *"What does `CacheMode.BYPASS` do, and how is it different from `CacheMode.DISABLED`?"*
|
||
|
||
3. **Migrating from Old to New System**
|
||
- *"How do I translate `bypass_cache=True` to the new `CacheMode` approach?"*
|
||
- *"I used to set `disable_cache=True`; what `CacheMode` should I use now?"*
|
||
- *"If I previously used `no_cache_read=True`, how do I achieve the same effect with `CacheMode`?"*
|
||
|
||
4. **Implementation Details**
|
||
- *"How do I specify the `CacheMode` in my crawler runs?"*
|
||
- *"Can I pass the `CacheMode` to `arun` directly, or do I need a `CrawlerRunConfig` object?"*
|
||
|
||
5. **Suppressing Deprecation Warnings**
|
||
- *"How can I temporarily disable deprecation warnings while I migrate my code?"*
|
||
|
||
6. **Edge Cases and Best Practices**
|
||
- *"What if I forget to update my code and still use the old flags?"*
|
||
- *"Is there a `CacheMode` for scenarios where I want to only write to cache and never read old data?"*
|
||
|
||
7. **Examples and Code Snippets**
|
||
- *"Can I see a side-by-side comparison of old and new caching code for a given URL?"*
|
||
- *"How can I confirm that using `CacheMode.BYPASS` skips both reading and writing cache?"*
|
||
|
||
8. **Performance and Reliability**
|
||
- *"Will switching to `CacheMode` improve my code’s readability and reduce confusion?"*
|
||
- *"Can the new caching system still handle large-scale crawling scenarios efficiently?"*
|
||
|
||
### Topics Discussed in the File
|
||
|
||
- **Old vs. New Caching Approach**:
|
||
Previously, multiple boolean flags (`bypass_cache`, `disable_cache`, `no_cache_read`, `no_cache_write`) controlled caching. Now, a single `CacheMode` enum simplifies configuration.
|
||
|
||
- **CacheMode Enum**:
|
||
Provides clear modes:
|
||
- `ENABLED`: Normal caching (read and write)
|
||
- `DISABLED`: No caching at all
|
||
- `READ_ONLY`: Only read from cache, don’t write new data
|
||
- `WRITE_ONLY`: Only write to cache, don’t read old data
|
||
- `BYPASS`: Skip cache entirely for this operation
|
||
|
||
- **Migration Patterns**:
|
||
A simple mapping table helps developers switch old boolean flags to the corresponding `CacheMode` value.
|
||
|
||
- **Suppressing Deprecation Warnings**:
|
||
Temporarily disabling deprecation warnings provides a grace period to update old code.
|
||
|
||
- **Code Examples**:
|
||
Side-by-side comparisons show how to update code from old flags to the new `CacheMode` approach.
|
||
|
||
In summary, the file guides developers in transitioning from the old caching boolean flags to the new `CacheMode` enum, explaining the rationale, providing a mapping table, and offering code snippets to facilitate a smooth migration. |