From 8e536b97178394afebe5ff069a2f0e5578153251 Mon Sep 17 00:00:00 2001
From: unclecode
Date: Sun, 12 May 2024 12:41:42 +0800
Subject: [PATCH] `chore: Refactor README.md and project structure`

---
 README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index cf52e632..8a49e1e4 100644
--- a/README.md
+++ b/README.md
@@ -10,11 +10,11 @@ Crawl4AI is a powerful, free web crawling service designed to extract useful inf
 
 ## 🚧 Work in Progress 👷‍♂️
 
-- 🔧 Separate Crawl and Extract JSON Semantic Chunk: Enhancing flexibility and efficiency in large-scale web crawling tasks.
-- 🔍 Colab Integration: Exploring integration with Google Colab for easy experimentation in a collaborative notebook environment.
-- 🎯 XPath and CSS Selector Support: Adding support for selective retrieval of specific elements from web pages.
-- 📷 Image Captioning: Incorporating image captioning capabilities to extract meaningful descriptions from images.
-- 💾 Embedding Data Generation and Storage: Developing functionalities to generate and store embedding data for each crawled website.
+- 🔧 Separate Crawl and Extract Semantic Chunk: Enhancing efficiency in large-scale tasks.
+- 🔍 Colab Integration: Exploring integration with Google Colab for easy experimentation.
+- 🎯 XPath and CSS Selector Support: Adding support for selective retrieval of specific elements.
+- 📷 Image Captioning: Incorporating image captioning capabilities to extract descriptions from images.
+- 💾 Embedding Vector Data: Generate and store embedding data for each crawled website.
 - 🔍 Semantic Search Engine: Building a semantic search engine that fetches content, performs vector search similarity, and generates labeled chunk data based on user queries and URLs.
 
 For more details, refer to the [CHANGELOG.md](https://github.com/unclecode/crawl4ai/edit/main/CHANGELOG.md) file.