sponsors: Add thor data as sponsor

2025-12-23 20:45:00 +05:30 · 2025-12-23 16:28:26 +05:30
2 changed files with 3 additions and 47 deletions
--- a/README.md
+++ b/README.md
@@ -1093,6 +1093,7 @@ Our enterprise sponsors and technology partners help scale Crawl4AI to power pro

 | Company | About | Sponsorship Tier |
 |------|------|----------------------------|
+| <a href="https://www.thordata.com/?ls=github&lk=crawl4ai" target="_blank"><img src="https://gist.github.com/aravindkarnam/dfc598a67be5036494475acece7e54cf/raw/thor_data.svg" alt="Thor Data" width="120"/></a>  | Thordata offers seamless compatibility with AI/ML workflows and data infrastructure, large-scale web data access with 99.9% uptime, and one-on-one customer support. | 🥈 Silver |
 | <a href="https://app.nstproxy.com/register?i=ecOqW9" target="_blank"><picture><source width="250" media="(prefers-color-scheme: dark)" srcset="https://gist.github.com/aravindkarnam/62f82bd4818d3079d9dd3c31df432cf8/raw/nst-light.svg"><source width="250" media="(prefers-color-scheme: light)" srcset="https://www.nstproxy.com/logo.svg"><img alt="nstproxy" src="ttps://www.nstproxy.com/logo.svg"></picture></a>  | NstProxy is a trusted proxy provider with over 110M+ real residential IPs, city-level targeting, 99.99% uptime, and low pricing at $0.1/GB, it delivers unmatched stability, scale, and cost-efficiency. | 🥈 Silver |
 | <a href="https://app.scrapeless.com/passport/register?utm_source=official&utm_term=crawl4ai" target="_blank"><picture><source width="250" media="(prefers-color-scheme: dark)" srcset="https://gist.githubusercontent.com/aravindkarnam/0d275b942705604263e5c32d2db27bc1/raw/Scrapeless-light-logo.svg"><source width="250" media="(prefers-color-scheme: light)" srcset="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"><img alt="Scrapeless" src="https://gist.githubusercontent.com/aravindkarnam/22d0525cc0f3021bf19ebf6e11a69ccd/raw/Scrapeless-dark-logo.svg"></picture></a>  | Scrapeless provides production-grade infrastructure for Crawling, Automation, and AI Agents, offering Scraping Browser, 4 Proxy Types and Universal Scraping API. | 🥈 Silver |
 | <a href="https://dashboard.capsolver.com/passport/register?inviteCode=ESVSECTX5Q23" target="_blank"><picture><source width="120" media="(prefers-color-scheme: dark)" srcset="https://docs.crawl4ai.com/uploads/sponsors/20251013045338_72a71fa4ee4d2f40.png"><source width="120" media="(prefers-color-scheme: light)" srcset="https://www.capsolver.com/assets/images/logo-text.png"><img alt="Capsolver" src="https://www.capsolver.com/assets/images/logo-text.png"></picture></a> | AI-powered Captcha solving service. Supports all major Captcha types, including reCAPTCHA, Cloudflare, and more | 🥉 Bronze |
--- a/crawl4ai/async_crawler_strategy.py
+++ b/crawl4ai/async_crawler_strategy.py
@@ -989,53 +989,8 @@ class AsyncPlaywrightCrawlerStrategy(AsyncCrawlerStrategy):
            mhtml_data = None

            if config.pdf:
-                if config.css_selector:
-                    # Extract content with styles and fixed image URLs
-                    content_with_styles = await page.evaluate(f"""
-                        () => {{
-                            const element = document.querySelector("{config.css_selector}");
-                            const clone = element.cloneNode(true);
-                            
-                            // Fix all image URLs to absolute
-                            clone.querySelectorAll('img').forEach(img => {{
-                                if (img.src) img.src = img.src;  // This converts to absolute URL
-                            }});
-                            
-                            // Get all styles
-                            const styles = Array.from(document.styleSheets)
-                                .map(sheet => {{
-                                    try {{
-                                        return Array.from(sheet.cssRules).map(rule => rule.cssText).join('\\n');
-                                    }} catch(e) {{
-                                        return '';
-                                    }}
-                                }}).join('\\n');
-                            
-                            return {{
-                                html: clone.outerHTML,
-                                styles: styles,
-                                baseUrl: window.location.origin
-                            }};
-                        }}
-                    """)
-                    
-                    # Create page with base URL for relative resources
-                    temp_page = await context.new_page()
-                    await temp_page.goto(content_with_styles['baseUrl'])  # Set the base URL
-                    await temp_page.set_content(f"""
-                        <html>
-                        <head>
-                            <base href="{content_with_styles['baseUrl']}">
-                            <style>{content_with_styles['styles']}</style>
-                        </head>
-                        <body>{content_with_styles['html']}</body>
-                        </html>
-                    """)
-                    
-                    pdf_data = await self.export_pdf(temp_page)
-                    await temp_page.close()
-                else:
-                    pdf_data = await self.export_pdf(page)
+                pdf_data = await self.export_pdf(page)
+
            if config.capture_mhtml:
                mhtml_data = await self.capture_mhtml(page)
Author	SHA1	Message	Date
Aravind Karnam	a234959b12	sponsors: Add thor data as sponsor	2025-12-23 20:45:00 +05:30
Aravind Karnam	da82f0ada5	sponsors: Add thor data as sponsor	2025-12-23 16:28:26 +05:30
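
Note on the second hunk: besides the sponsor row, this change drops the `css_selector` branch in `AsyncPlaywrightCrawlerStrategy`, so `config.pdf` now always calls `export_pdf(page)` on the full page. For readers who relied on the old behaviour, the following is a minimal sketch of the wrapper document that the removed branch passed to `set_content()`. The helper name `build_selector_pdf_html` is hypothetical and not part of Crawl4AI, and escaping the base URL is an added assumption that the removed code did not make.

```python
from html import escape


def build_selector_pdf_html(base_url: str, styles: str, element_html: str) -> str:
    """Assemble a wrapper document similar to the one the removed branch rendered.

    Hypothetical helper for illustration only; it is not part of Crawl4AI.
    """
    # Assumption: escape the base URL, since it is placed inside an attribute value.
    safe_base = escape(base_url, quote=True)
    return (
        "<html>\n"
        "<head>\n"
        f'    <base href="{safe_base}">\n'
        f"    <style>{styles}</style>\n"
        "</head>\n"
        f"<body>{element_html}</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Example inputs standing in for the values the page.evaluate() call returned.
    doc = build_selector_pdf_html(
        "https://example.com",
        "p { color: red; }",
        "<article><p>Hello</p></article>",
    )
    print(doc)
```

In the removed code the three inputs came from `window.location.origin`, the concatenated `cssRules` of every readable stylesheet, and the `outerHTML` of the cloned element matched by `config.css_selector`.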