LangChain Integration with FAISS

  • 1. FAISS
  • 2. Similarity Search with score
  • 3. Saving and loading
  • 4. Merging
  • 5. Similarity Search with filtering

1. FAISS

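Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.

See the Faiss documentation for details.

This notebook shows how to use functionality related to the FAISS vector database.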

Example code:

# !pip install faiss-gpu  # for CUDA-capable GPUs
# OR
# !pip install faiss-cpu
import os
import getpass

os.environ["COHERE_API_KEY"] = getpass.getpass("Cohere API Key:")

# Uncomment the following line if you need to initialize FAISS without AVX2 optimization
# os.environ['FAISS_NO_AVX2'] = '1'
# from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.document_loaders import TextLoader

Example code:

# Load the source text and split it into chunks of about 1000 characters
loader = TextLoader("./state_of_the_union_en.txt", encoding="utf-8")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# embeddings = OpenAIEmbeddings()
embeddings = CohereEmbeddings()

Example code:

# Build the index from the document chunks, then run a similarity search
db = FAISS.from_documents(docs, embeddings)

query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)

Output:

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.

2. Similarity Search with score

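There are some FAISS-specific methods. One of them is similarity_search_with_score, which returns not only the documents but also the distance score between the query and each of them. The returned score is the L2 distance, so a lower score is better.

Example code: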

docs_and_scores = db.similarity_search_with_score(query)
docs_and_scores[0]

Output:

(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'}),7172.888)

For reference, the score reported in the official documentation (https://python.langchain.com/docs/integrations/vectorstores/faiss) is 0.36913747:

    (Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),0.36913747)
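The source does not explain why the score here (7172.888) differs so much from the documented one (0.36913747). One plausible reason, stated here as an assumption, is that the two runs use different embedding models whose vectors have very different magnitudes. The sketch below uses made-up 3-dimensional vectors (not real embeddings) to show how the same pair of directions yields a large L2 distance at raw scale and a small one after normalization to unit length. Faiss's flat L2 index is documented as reporting squared distances, so both values are printed.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Toy "embeddings", invented for illustration only.
query_vec = [30.0, 40.0, 0.0]
doc_vec = [40.0, 30.0, 0.0]

raw = squared_l2(query_vec, doc_vec)
unit = squared_l2(normalize(query_vec), normalize(doc_vec))

print(f"raw vectors:  squared L2 = {raw:.4f}, L2 = {math.sqrt(raw):.4f}")
print(f"unit vectors: squared L2 = {unit:.4f}, L2 = {math.sqrt(unit):.4f}")
```

In both cases a lower score still means a closer match; only the absolute scale changes, so scores should be compared within one index rather than across embedding models.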

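You can also search for documents similar to a given embedding vector with similarity_search_by_vector, which takes an embedding vector as its argument instead of a string.

Example code:

embedding_vector = embeddings.embed_query(query)
# Returns documents only, without scores
docs = db.similarity_search_by_vector(embedding_vector)
docs

Output: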

[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='We can’t change how divided we’ve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n\nI recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n\nThey were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n\nOfficer Mora was 27 years old. \n\nOfficer Rivera was 22. \n\nBoth Dominican Americans who’d grown up on the same streets they later chose to patrol as police officers. \n\nI spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n\nI’ve worked on these issues a long time. 
\n\nI know what works: Investing in crime preventionand community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \n\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n\nWhile it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \n\nAnd soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \n\nSo tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  \n\nFirst, beat the opioid epidemic.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='Tonight, I’m announcing a crackdown on these companies overcharging American businesses and consumers. \n\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up.  \n\nThat ends on my watch. \n\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \n\nWe’ll also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \n\nLet’s pass the Paycheck Fairness Act and paid leave.  \n\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. 
\n\nLet’s increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls America’s best-kept secret: community colleges.', metadata={'source': './state_of_the_union_en.txt'})]

3. Saving and loading

You can also save and load a FAISS index. This is useful because you do not have to rebuild the index every time you use it.

Example code:

db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
docs[0]

Output:

Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'})

4. Merging

You can also merge two FAISS vector stores.

Example code:

db1 = FAISS.from_texts(["foo"], embeddings)
db2 = FAISS.from_texts(["bar"], embeddings)
db1.docstore._dict

Output:

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={})}

Example code:

db2.docstore._dict

Output:

{'8dcb4556-8eb5-43be-9eaa-0bff9a6e7997': Document(page_content='bar', metadata={})}

Example code:

db1.merge_from(db2)
db1.docstore._dict

Output:

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={}),'8dcb4556-8eb5-43be-9eaa-0bff9a6e7997': Document(page_content='bar', metadata={})}
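To make the effect of merging easier to picture, here is a minimal conceptual sketch. ToyStore and all of its fields are hypothetical names invented for this illustration; this is not LangChain's implementation. It assumes only that a store keeps a list of vectors, a mapping from vector position to document id, and a docstore keyed by id, and that merging appends the second store's entries after the first's.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for a vector store (illustration only)."""
    vectors: list = field(default_factory=list)      # position -> vector
    index_to_id: dict = field(default_factory=dict)  # position -> document id
    docstore: dict = field(default_factory=dict)     # document id -> text

    def add(self, text, vector):
        doc_id = str(uuid.uuid4())
        self.index_to_id[len(self.vectors)] = doc_id
        self.vectors.append(vector)
        self.docstore[doc_id] = text

    def merge_from(self, other):
        # Positions from the other store are shifted past the existing vectors.
        offset = len(self.vectors)
        self.vectors.extend(other.vectors)
        for pos, doc_id in other.index_to_id.items():
            self.index_to_id[offset + pos] = doc_id
            self.docstore[doc_id] = other.docstore[doc_id]


store1 = ToyStore()
store1.add("foo", [1.0, 0.0])
store2 = ToyStore()
store2.add("bar", [0.0, 1.0])

store1.merge_from(store2)
print(sorted(store1.docstore.values()))
print(len(store1.vectors))
```

As in the output above, the merged store holds both documents under their original ids, while the second store is left unchanged in this sketch.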

5. Similarity Search with filtering

The FAISS vector store can also support filtering. Because FAISS itself does not support filtering natively, it has to be done manually: more than k results are fetched first and then filtered. You can filter documents based on their metadata. You can also set the fetch_k parameter when calling any search method to control how many documents are fetched before filtering. Here is a small example:

Example code:

from langchain.schema import Document

list_of_documents = [
    Document(page_content="foo", metadata=dict(page=1)),
    Document(page_content="bar", metadata=dict(page=1)),
    Document(page_content="foo", metadata=dict(page=2)),
    Document(page_content="barbar", metadata=dict(page=2)),
    Document(page_content="foo", metadata=dict(page=3)),
    Document(page_content="bar burr", metadata=dict(page=3)),
    Document(page_content="foo", metadata=dict(page=4)),
    Document(page_content="bar bruh", metadata=dict(page=4)),
]
db = FAISS.from_documents(list_of_documents, embeddings)
results_with_scores = db.similarity_search_with_score("foo")
for doc, score in results_with_scores:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

Output:

Content: foo, Metadata: {'page': 1}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 2}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 3}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 4}, Score: 0.018019594252109528

Now we make the same query call, but filter for page = 1 only:

results_with_scores = db.similarity_search_with_score("foo", filter=dict(page=1))
for doc, score in results_with_scores:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

Output:

Content: foo, Metadata: {'page': 1}, Score: 0.018019594252109528
Content: bar, Metadata: {'page': 1}, Score: 10266.8544921875

The same filtering can be done with max_marginal_relevance_search.

Example code:

results = db.max_marginal_relevance_search("foo", filter=dict(page=1))
for doc in results:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

Output:

Content: foo, Metadata: {'page': 1}
Content: bar, Metadata: {'page': 1}
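The text does not describe what maximal marginal relevance (MMR) does, so here is a minimal sketch of the general idea: pick results one at a time, rewarding similarity to the query and penalizing similarity to results already chosen. The function names, the toy 2-dimensional vectors, and the lambda_mult weighting below are assumptions for illustration and are not claimed to match LangChain's internals.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(query_vec, candidates, k=2, lambda_mult=0.5):
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidates[i])
            # Highest similarity to anything already selected (0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 0.0]
candidates = [
    [0.9, 0.1],    # 0: most similar to the query
    [0.9, 0.12],   # 1: near-duplicate of candidate 0
    [0.6, -0.8],   # 2: less similar, but different from candidate 0
]

print(mmr(query_vec, candidates, k=2, lambda_mult=0.5))  # balances relevance and diversity
print(mmr(query_vec, candidates, k=2, lambda_mult=1.0))  # relevance only
```

With equal weighting the near-duplicate is skipped in favor of the more diverse candidate; with lambda_mult=1.0 the selection reduces to a plain similarity ranking.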

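Here is an example of how to set the fetch_k parameter when calling similarity_search. Usually you want fetch_k to be much larger than k (fetch_k >> k), because fetch_k is the number of documents fetched before filtering. If fetch_k is set to a small number, there may not be enough documents left after filtering.

Example code:

results = db.similarity_search("foo", filter=dict(page=1), k=1, fetch_k=4)
for doc in results:
    print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

Output: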

Content: foo, Metadata: {'page': 1}
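The fetch-then-filter behavior described in this section can be sketched in plain Python. This is a simplified model under stated assumptions, not the library's code: filtered_search, the dictionary layout, and the 1-dimensional toy vectors are all invented for illustration. It shows why a small fetch_k can return fewer than k results even though matching documents exist.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(query_vec, docs, k, fetch_k, metadata_filter):
    """Fetch the fetch_k nearest docs, keep those matching the filter, return at most k."""
    nearest = sorted(docs, key=lambda d: squared_l2(query_vec, d["vector"]))[:fetch_k]
    matches = [
        d for d in nearest
        if all(d["metadata"].get(key) == value for key, value in metadata_filter.items())
    ]
    return matches[:k]


# Toy data: the "foo" documents sit close to the query, "bar" is far away.
docs = [
    {"text": "foo", "vector": [0.0], "metadata": {"page": 1}},
    {"text": "bar", "vector": [9.0], "metadata": {"page": 1}},
    {"text": "foo", "vector": [0.1], "metadata": {"page": 2}},
    {"text": "foo", "vector": [0.2], "metadata": {"page": 3}},
    {"text": "foo", "vector": [0.3], "metadata": {"page": 4}},
]
query_vec = [0.0]

# fetch_k=4 never reaches the distant "bar" document, so only one page-1 match survives.
small = filtered_search(query_vec, docs, k=2, fetch_k=4, metadata_filter={"page": 1})
# fetch_k=5 covers every document, so both page-1 matches are returned.
large = filtered_search(query_vec, docs, k=2, fetch_k=5, metadata_filter={"page": 1})

print([d["text"] for d in small])
print([d["text"] for d in large])
```

The filter is applied only to what was fetched, which is why the filtered result earlier in this section can include a distant match such as "bar" when fetch_k is large enough to reach it.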

That's all!
