World Digital Technology Academy (WDTA)
Large Language Model Security Testing Method
World Digital Technology Academy Standard
WDTA AI-STR-02
Edition: 2024-04

© WDTA 2024 – All rights reserved.

The World Digital Technology Standard WDTA AI-STR-02 is designated as a WDTA norm. This document is the property of the World Digital Technology Academy (WDTA) and is protected by international copyright laws. Any use of this document, including reproduction, modification, distribution, or republication, without the prior written permission of WDTA, is prohibited. WDTA is not liable for any errors or omissions in this document.

Discover more WDTA standards and related publications at https://wdtacademy.org/.

Version History*

Standard ID      Version   Date      Changes
WDTA AI-STR-02   1.0       2024-04   Initial Release

Foreword

The "Large Language Model Security Testing Method," developed and issued by the World Digital Technology Academy (WDTA), represents a crucial advancement in our ongoing commitment to ensuring the responsible and secure use of artificial intelligence technologies. As AI systems, particularly large language models, continue to become increasingly integral to various aspects of society, the need for a comprehensive standard to address their security challenges becomes paramount. This standard, an integral part of WDTA's AI STR (Safety, Trust, Responsibility) program, is specifically designed to tackle the complexities inherent in large language models and provide rigorous evaluation metrics and procedures to test their resilience against adversarial attacks.

This standard document provides a framework for evaluating the resilience of large language models (LLMs) against adversarial attacks. The framework applies to the testing and validation of LLMs across various attack classifications, including L1 Random, L2 Blind-Box, L3 Black-Box, and L4 White-Box. Key metrics used to assess the effectiveness of these attacks include the Attack Success Rate (R) and Decline Rate (D). The document outlines a diverse range of attack methodologies, such as instruction hijacking and prompt masking, to comprehensively test the LLMs' resistance to different types of adversarial techniques. The testing procedure detailed in this standard document aims to establish a structured approach for evaluating the robustness of LLMs against adversarial attacks, enabling developers and organizations to identify and mitigate potential vulnerabilities, and ultimately improve the security and reliability of AI systems built using LLMs.

By establishing the "Large Language Model Security Testing Method," WDTA seeks to lead the way in creating a digital ecosystem where AI systems are not only advanced but also secure and ethically aligned. It symbolizes our dedication to a future where digital technologies are developed with a keen sense of their societal implications and are leveraged for the greater benefit of all.

Executive Chairman of WDTA
