Change the map-crawl urls output to work around Dify's list limit and HTTP timeouts

2025-12-29 11:29:52 +08:00
parent 79b3f79c15
commit 20317e6788
4 changed files with 25 additions and 8 deletions

@@ -1,4 +1,4 @@
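-def main(map_json: list[dict], BASE_URL: str):
+def main(map_json: list[dict]):
     """
     Convert the output of the Firecrawl Map node into a clean output so that miscellaneous data does not interfere.
     Input: the output of the Firecrawl Map node, structured as follows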
@@ -17,6 +17,21 @@ def main(map_json: list[dict], BASE_URL: str):
     map_obj = map_json[0]
     return {
-        "urls": map_obj["links"],
-        "code": int(map_obj["success"]),
+        # "urls": map_obj["links"],
+        # "code": int(map_obj["success"]),
+        "urls_obj": {
+            "urls": map_obj["links"]
+        }
     }
+    '''
+    Example return value
+    {
+        "urls_obj": {
+            "urls": [
+                "http://example.com/page1",
+            ]
+        }
+    }
+    '''
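
For reference, the function as it stands after this change can be sketched as a standalone script. This is a minimal sketch, not the full file: the docstring is abbreviated, and the sample input (a single object with `success` and `links` keys, plus the `example.com` URLs) is a hypothetical stand-in for real Firecrawl Map output. Going by the commit message, nesting the list under `urls_obj` is meant to keep it from being returned as a top-level list output, which is where the Dify limit is assumed to apply.

```python
def main(map_json: list[dict]) -> dict:
    """Reduce the Firecrawl Map node output to just the URL list."""
    # The Map node output is a one-element list; take the first object.
    map_obj = map_json[0]
    # Return the links nested inside an object rather than as a top-level list.
    return {
        "urls_obj": {
            "urls": map_obj["links"]
        }
    }


# Hypothetical sample input, shaped like the fields the function reads.
sample = [
    {
        "success": True,
        "links": [
            "http://example.com/page1",
            "http://example.com/page2",
        ],
    }
]

result = main(sample)
print(result["urls_obj"]["urls"])
```

A downstream node would then read the list from `urls_obj.urls` instead of `urls`, and the old `code` field is no longer available.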