
2023年经济学人 人工智能公司的数据争夺战(下)

时间:2024-01-25 08:21 来源:互联网

The upshot has been a flurry of dealmaking as AI companies race to secure data sources.

结果是随着人工智能公司竞相获取数据来源,一系列交易被达成。

In July OpenAI inked a deal with Associated Press, a news agency, to access its archive of stories.

今年7月,OpenAI与新闻机构美联社签署了一项协议,目的是使用其新闻报道。

It has also recently expanded an agreement with Shutterstock, a provider of stock photography, with which Meta has a deal, too.

OpenAI最近还扩大了与Shutterstock(一家版权图片提供商)的协议,Meta也与其做了交易。

On August 8th it was reported that Google was in discussions with Universal Music, a record label, to license[1] artists’ voices to feed a songwriting AI tool.

8月8日,有报道称,谷歌正在与唱片公司环球音乐洽谈,希望授权把歌手的声音输入给一个编写歌曲的AI工具。

Rumours[2] swirl[3] about AI labs approaching the BBC, Britain’s public broadcaster.

关于各家AI实验室与英国公共广播公司BBC接洽的谣言也沸沸扬扬。

Another supposed target is JSTOR, a digital library of academic journals.

另一个假定的目标是JSTOR,一个收纳学术期刊的数字图书馆。

Holders[4] of information are taking advantage of their greater bargaining power.

信息持有者正在利用他们更大的议价能力。

Reddit, a discussion forum[5], and Stack Overflow[6], a question-and-answer site popular with coders, have increased the cost of access to their data.

论坛Reddit和深受程序员欢迎的问答网站Stack Overflow提高了访问其数据的成本。

Both websites are particularly valuable because users “upvote” preferred answers, helping[7] models know which are most relevant.

这两个网站都特别有价值,因为用户会投票把更好的回答“顶上去”,从而帮助模型了解哪些回答最有价值。

Twitter (now known as X), a social-media site, has put in place measures to limit the ability of bots to scrape the site and now charges anyone who wishes to access its data.

社交媒体网站推特(现已更名为X)已经采取措施,限制机器人盗取其网站数据的能力,并向任何想要访问其数据的人收费。

Elon Musk[8], its mercurial[9] owner, is planning to build his own AI business using the data.

推特的老板--捉摸不定的埃隆·马斯克--正计划利用这些数据建立自己的人工智能业务。

As a consequence, model-builders are working hard to improve the quality of the inputs[11] they already have.

因此,模型建造者正在努力提高现有数据的质量。

Many AI labs employ armies of data annotators to perform tasks such as labelling images and rating answers.

许多AI实验室雇佣了大量的数据注释员,来执行诸如给图像标记和给答案评分的任务。

Some of that work is complex; an advert[12] for one such job seeks applicants[13] with a master’s degree or doctorate[14] in life sciences.

其中一些工作很复杂,有一条这类工作的招聘广告希望应聘人有生命科学硕士或博士学位。

But much of it is mundane[15], and is being outsourced to places such as Kenya where labour is cheap.

但大多数工作很单调,并被外包到肯尼亚等劳动力廉价的地方。

AI firms are also gathering[16] data through users’ interactions with their tools.

人工智能公司也在通过用户与其工具的互动来收集数据。

Many of these have a feedback mechanism[17], where users indicate which outputs are useful.

其中许多都有反馈机制,用户可以指出哪些输出是有用的。

Firefly’s text-to-image generator[18] allows users to pick from one of four options.

Firefly的文本转图像生成器允许用户从四个选项中进行选择。

Bard[19], Google’s chatbot, proposes three answers.

谷歌的聊天机器人Bard会给出三个答案。

Users can give ChatGPT a thumbs-up or thumbs-down to its responses.

用户可以对ChatGPT的回复点击“喜欢”或“不喜欢”。

That information can be fed back as an input[10] into the underlying[20] model, forming what Douwe Kiela, co-founder of Contextual AI, a startup, calls the “data flywheel”.

这些信息可以再反馈回底层模型,形成初创公司Contextual AI的联合创始人杜威·基拉所说的“数据飞轮”。

A stronger signal still of the quality of a chatbot’s answers is whether users copy the text and paste it elsewhere, he adds.

他补充说,表明聊天机器人的回答质量高的一个更有力的信号是,用户会把文本复制并粘贴到其他地方。

That information helped Google rapidly improve its translation tool.

这些信息帮助谷歌迅速改进了其翻译工具。

There is, however, one source of data that remains[21] largely untapped: the information that exists within the walls of the tech firms’ corporate[22] customers.

然而,有一个数据来源在很大程度上仍未被开发:科技公司企业客户的内部信息。

Many businesses possess, often unwittingly, vast amounts of useful data, from call-centre transcripts[23] to customer spending records.

许多企业拥有大量有用的数据,从客服中心的文字记录到客户的消费记录,这些数据往往都是在无意中掌握的。

Such information is especially valuable because it can be used to fine-tune models for specific business purposes, such as helping call-centre workers answer queries[24] or analysts[25] spot ways to boost sales.

这类信息特别有价值,因为可以用来微调模型而达到特定的商业目的,比如帮助客服中心的工作人员回答问题,或者帮助分析师找到提高销量的方法。

Yet making use of that rich resource is not always straightforward[26].

然而,这些丰富的资源并不总是可以直接利用。

Roy Singh of Bain, a consultancy, notes that most firms have historically paid little attention to the types of vast but unstructured datasets that would prove most useful for training AI tools.

贝恩咨询公司的罗伊·辛格指出,过去大多数公司几乎没有注意到那些海量但非结构化的数据集,这些数据集对训练AI工具是最有用的。

Often these are spread across various systems, buried in company servers rather than in the cloud.

这些数据通常分布在不同的系统中,深藏在公司的服务器里,而不是在云端。

Unlocking that information would help companies customise AI tools to serve their needs better.

解锁这些信息将帮助公司按需要创造AI工具,以更好地满足他们的需求。

Amazon and Microsoft, two tech giants, now offer tools to help companies improve management of their unstructured datasets, as does Google.

亚马逊和微软这两家科技巨头现在提供工具,帮助公司改善对非结构化数据集的管理,谷歌也有类似行动。

Christian[27] Kleinerman of Snowflake, a database firm, says that business is booming as clients look to “tear down data silos”.

来自数据库公司Snowflake的克里斯蒂安·克莱纳曼表示,随着客户想要“拆除储存数据的筒仓”,数据业务正在蓬勃发展。

Startups are piling in.

初创企业正蜂拥而至。

In April Weaviate, an AI-focused database business, raised $50m at a valuation of $200m.

今年4月,专注于人工智能的数据库企业Weaviate以2亿美元的估值筹集了5000万美元。

Barely a week later Pinecone, a rival, raised $100m at a $750m valuation.

仅仅一周后,其竞争对手Pinecone就以7.5亿美元的估值筹集了1亿美元。

Earlier this month Neon, another database startup, raised an additional $46m in funding.

本月初,另一家数据库初创公司Neon又筹集了4600万美元的资金。

The scramble[28] for data is only just getting started.

数据争夺战才刚刚开始。



1 license
n.执照,许可证,特许;v.许可,特许
参考例句:
  • The foreign guest has a license on the person.这个外国客人随身携带执照。
  • The driver was arrested for having false license plates on his car.司机由于使用假车牌而被捕。
2 rumours
n.传闻( rumour的名词复数 );风闻;谣言;谣传
参考例句:
  • The rumours were completely baseless. 那些谣传毫无根据。
  • Rumours of job losses were later confirmed. 裁员的传言后来得到了证实。
3 swirl
v.(使)打漩,(使)涡卷;n.漩涡,螺旋形
参考例句:
  • The car raced roughly along in a swirl of pink dust.汽车在一股粉红色尘土的漩涡中颠簸着快速前进。
  • You could lie up there,watching the flakes swirl past.你可以躺在那儿,看着雪花飘飘。
4 holders
支持物( holder的名词复数 ); 持有者; (支票等)持有人; 支托(或握持)…之物
参考例句:
  • Slaves were mercilessly ground down by slave holders. 奴隶受奴隶主的残酷压迫。
  • It is recognition of compassion's part that leads the up-holders of capital punishment to accuse the abolitionists of sentimentality in being more sorry for the murderer than for his victim. 正是对怜悯的作用有了认识,才使得死刑的提倡者指控主张废除死刑的人感情用事,同情谋杀犯胜过同情受害者。
5 forum
n.论坛,讨论会
参考例句:
  • They're holding a forum on new ways of teaching history.他们正在举行历史教学讨论会。
  • The organisation would provide a forum where problems could be discussed.这个组织将提供一个可以讨论问题的平台。
6 overflow
v.(使)外溢,(使)溢出;溢出,流出,漫出
参考例句:
  • The overflow from the bath ran on to the floor.浴缸里的水溢到了地板上。
  • After a long period of rain,the river may overflow its banks.长时间的下雨天后,河水可能溢出岸来。
7 helping
n.食物的一份&adj.帮助人的,辅助的
参考例句:
  • The poor children regularly pony up for a second helping of my hamburger. 那些可怜的孩子们总是要求我把我的汉堡包再给他们一份。
  • By doing this, they may at times be helping to restore competition. 这样一来, 他在某些时候,有助于竞争的加强。
8 musk
n.麝香, 能发出麝香的各种各样的植物,香猫
参考例句:
  • Musk is used for perfume and stimulant.麝香可以用作香料和兴奋剂。
  • She scented her clothes with musk.她用麝香使衣服充满了香味。
9 mercurial
adj.善变的,活泼的
参考例句:
  • He was of a mercurial temperament and therefore unpredictable.他是个反复无常的人,因此对他的行为无法预言。
  • Our desires and aversions are mercurial rulers.我们的欲望与嫌恶是变化无常的统治者。
10 input
n.输入(物);投入;vt.把(数据等)输入计算机
参考例句:
  • I will forever be grateful for his considerable input.我将永远感激他的大量投入。
  • All this information had to be input onto the computer.所有这些信息都必须输入计算机。
11 inputs
n.输入( input的名词复数 );投入;输入端;输入的数据v.把…输入电脑( input的第三人称单数 )
参考例句:
  • Uncheck the inputs checked for optimization in the previous stage. 不测试那些已经测试过的优化了的以前步骤的inputs.(变量参数)。 来自互联网
  • Just in case, save in a file the inputs obtained at the previous stage. 以防万一,保存以前步骤获得的inputs(变量参数值)到一个文件中去。 来自互联网
12 advert
vi.注意,留意,言及;n.广告
参考例句:
  • The advert featured a dolphin swimming around a goldfish bowl.该广告的內容为一条在金鱼缸里游动的海豚。
  • Please advert to the contents below.I believe you won't be disappointed.敬请留意后面的内容。相信您一定不会失望的。
13 applicants
申请人,求职人( applicant的名词复数 )
参考例句:
  • There were over 500 applicants for the job. 有500多人申请这份工作。
  • He was impressed by the high calibre of applicants for the job. 求职人员出色的能力给他留下了深刻印象。
14 doctorate
n.(大学授予的)博士学位
参考例句:
  • He hasn't enough credits to get his doctorate.他的学分不够取得博士学位。
  • Where did she do her doctorate?她在哪里攻读博士?
15 mundane
adj.平凡的;尘世的;宇宙的
参考例句:
  • I hope I can get an interesting job and not something mundane.我希望我可以得到的是一份有趣的工作,而不是一份平凡无奇的。
  • I find it humorous sometimes that even the most mundane occurrences can have an impact on our awareness.我发现生活有时挺诙谐的,即使是最平凡的事情也能影响我们的感知。
16 gathering
n.集会,聚会,聚集
参考例句:
  • He called on Mr. White to speak at the gathering.他请怀特先生在集会上讲话。
  • He is on the wing gathering material for his novels.他正忙于为他的小说收集资料。
17 mechanism
n.机械装置;机构,结构
参考例句:
  • The bones and muscles are parts of the mechanism of the body.骨骼和肌肉是人体的组成部件。
  • The mechanism of the machine is very complicated.这台机器的结构是非常复杂的。
18 generator
n.发电机,发生器
参考例句:
  • All the while the giant generator poured out its power.巨大的发电机一刻不停地发出电力。
  • This is an alternating current generator.这是一台交流发电机。
19 bard
n.吟游诗人
参考例句:
  • I'll use my bard song to help you concentrate!我会用我的吟游诗人歌曲帮你集中精神!
  • I find him,the wandering grey bard.我发现了正在徘徊的衰老游唱诗人。
20 underlying
adj.在下面的,含蓄的,潜在的
参考例句:
  • The underlying theme of the novel is very serious.小说隐含的主题是十分严肃的。
  • This word has its underlying meaning.这个单词有它潜在的含义。
21 remains
n.剩余物,残留物;遗体,遗迹
参考例句:
  • He ate the remains of food hungrily.他狼吞虎咽地吃剩余的食物。
  • The remains of the meal were fed to the dog.残羹剩饭喂狗了。
22 corporate
adj.共同的,全体的;公司的,企业的
参考例句:
  • This is our corporate responsibility.这是我们共同的责任。
  • His corporate's life will be as short as a rabbit's tail.他的公司的寿命是兔子尾巴长不了。
23 transcripts
n.抄本( transcript的名词复数 );转写本;文字本;副本
参考例句:
  • Like mRNA, both tRNA and rRNA are transcripts of chromosomal DNA. tRNA及rRNA同mRNA一样,都是染色体DNA的转录产物。 来自辞典例句
  • You can't take the transfer students'exam without your transcripts. 没有成绩证明书,你就不能参加转学考试。 来自辞典例句
24 queries
n.问题( query的名词复数 );疑问;询问;问号v.质疑,对…表示疑问( query的第三人称单数 );询问
参考例句:
  • Our assistants will be happy to answer your queries. 我们的助理很乐意回答诸位的问题。
  • Her queries were rhetorical,and best ignored. 她的质问只不过是说说而已,最好不予理睬。 来自《简明英汉词典》
25 analysts
分析家,化验员( analyst的名词复数 )
参考例句:
  • City analysts forecast huge profits this year. 伦敦金融分析家预测今年的利润非常丰厚。
  • I was impressed by the high calibre of the researchers and analysts. 研究人员和分析人员的高素质给我留下了深刻印象。
26 straightforward
adj.正直的,坦率的;易懂的,简单的
参考例句:
  • A straightforward talk is better than a flowery speech.巧言不如直说。
  • I must insist on your giving me a straightforward answer.我一定要你给我一个直截了当的回答。
27 Christian
adj.基督教徒的;n.基督教徒
参考例句:
  • They always addressed each other by their Christian name.他们总是以教名互相称呼。
  • His mother is a sincere Christian.他母亲是个虔诚的基督教徒。
28 scramble
v.爬行,攀爬,杂乱蔓延,碎片,片段,废料
参考例句:
  • He broke his leg in his scramble down the wall.他爬墙摔断了腿。
  • It was a long scramble to the top of the hill.到山顶须要爬登一段长路。