refactor: Remove image format dot in get_content_of_website_optimized

Strip the leading dot from the image format in the `get_content_of_website_optimized` function. `os.path.splitext` returns the extension with its dot (for example `.jpg`), so after this change `image_format` holds the bare extension (`jpg`), which keeps the format value consistent.
unclecode
2024-07-31 16:15:55 +08:00
parent efcf3ac6eb
commit 40477493d3


@@ -498,6 +498,8 @@ def get_content_of_website_optimized(url: str, html: str, word_count_threshold:
 width_value, width_unit = parse_dimension(image_width)
 image_size = 0 #int(fetch_image_file_size(img,base_url) or 0)
 image_format = os.path.splitext(img.get('src',''))[1].lower()
+# Remove . from format
+image_format = image_format.strip('.')
 score = 0
 if height_value:
     if height_unit == 'px' and height_value > 150:
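
To illustrate the effect of the two added lines in isolation, here is a minimal sketch. The helper name `image_format_from_src` and the sample paths are hypothetical and are not part of the commit; only the `os.path.splitext(...)[1].lower()` and `.strip('.')` calls mirror the diff.

```python
import os


def image_format_from_src(src: str) -> str:
    """Hypothetical helper mirroring the diff: lowercase extension, no dot."""
    # splitext returns (root, ext); ext keeps its leading dot, e.g. ".JPG"
    ext = os.path.splitext(src)[1].lower()
    # Same call as the added line: drop the dot so ".jpg" becomes "jpg"
    return ext.strip('.')


print(image_format_from_src("photos/cat.JPG"))       # jpg
print(image_format_from_src("photos/no_extension"))  # prints an empty line
```

When `src` has no extension, `splitext` returns an empty string and the strip is a no-op, so the result is still an empty string. A query string, if present in `src`, stays attached to the extension because `splitext` does not parse URLs.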