[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"$fZ-2nk2bQL2Gazy3taX2xdOL_WSSkb3NOp5_bwVWdL3A":3},{"answer":4,"createTime":5,"id":6,"options":7,"origin":10,"question":13,"related":14,"source":24,"type":57},[],"2025-05-11 08:18:03",1060769796,[8,9],"True","False",{"courseImg":11,"courseName":12},"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002Fcf3bb414b5ea2367f316b2d3561124c7.jpg","[Shared Course] Artificial Intelligence","Greedy search is guaranteed to find the optimal solution, because it always generates and expands nodes in the direction that moves closer to the goal state. ( )",[15,26,35,43,52,58,63,72,81,84],{"answer":16,"createTime":5,"id":17,"options":18,"question":23,"source":24,"type":25},[],1060768783,[19,20,21,22],"BFS","DFS","UCS","None","Suppose a search tree has finite height and all step costs are non-negative. If a positive cost c&gt;0 is added to every edge, then among the following tree search algorithms, the search path found by ( ) remains unchanged","v2",1,{"answer":27,"createTime":5,"id":28,"options":29,"question":33,"source":24,"type":34},[],1060768829,[30,31,32],"\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F97b167f3818a90dea33605a6ed34d7a7.png\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002Fcedeec654add2b9a6a5a787694ce6f00.png\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002Fb6e7c89a3f5b337c14d00444d8e0b40d.png\">","The optimal policy obtained with the \u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F29311cf92ac2797226068a7e6ae0bde8.png\">-greedy Q-learning algorithm is ( )",0,{"answer":36,"createTime":5,"id":37,"options":38,"question":42,"source":24,"type":34},[],1060768923,[39,40,41],"-1","-2","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F0667569d70d702a708ffd70eafae0159.png\">","An MDP has three states A, B, and C, and the only action available to the agent is to move right (\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F314a42688fdce41a09ed9f49b8584a7e.png\">). The transition model is given below. We run Q-learning on it for infinitely many iterations. If the discount factor is 1 and the learning rate is 1, then \u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F1689b9d180a8ea9f0638df278b32f729.png\"> ( )\u003Cimg 
src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F03a647b1f55e3d7f0768ba11068dbf8f.png\">",{"answer":44,"createTime":5,"id":45,"options":46,"question":51,"source":24,"type":25},[],1060769090,[47,48,49,50],"\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F9b90370e5ec69b2b59be48507b6e3572.png\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F1d4d44977437618fde6664aceef8a95d.png\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002Ff33327ccda535f9d90f8b9f6c47ef6d7.png\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F13d08f20bd847d79a33137bb55671741.png\">","Which of the following formulas are correct? ( )",{"answer":53,"createTime":5,"id":54,"options":55,"question":56,"source":24,"type":57},[],1060769135,[8,9],"Model-based reinforcement learning involves purely offline computation, whereas model-free reinforcement learning requires online interaction with the environment. ( )",3,{"answer":59,"createTime":5,"id":60,"options":61,"question":62,"source":24,"type":57},[],1060769164,[8,9],"Breadth-first search finds the search path with the fewest steps and also guarantees that the path has minimum cost. ( )",{"answer":64,"createTime":5,"id":65,"options":66,"question":71,"source":24,"type":25},[],1060769641,[67,68,69,70],"Value iteration","State iteration","Policy iteration","Return iteration","In model-based reinforcement learning, which of the following are dynamic programming solution methods? ( )",{"answer":73,"createTime":5,"id":74,"options":75,"question":80,"source":24,"type":34},[],1060769751,[76,77,78,79],"\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F337926b18a7ceaabdfad5b2639b7f157.jpg\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F4380e14a56df3bb7de25cefb3358a2f9.jpg\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002Fc960c31c4270d294fcb0f674bb6fc0af.jpg\">","\u003Cimg src=\"https:\u002F\u002Ftihai-oss-cloud.itihey.com\u002Fimg\u002F5f869f433b69cf3430fc9bb56d268ccd.jpg\">","In value function approximation for reinforcement learning, the Monte Carlo update formula for the parameters is ( 
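)",{"answer":82,"createTime":5,"id":6,"options":83,"question":13,"source":24,"type":57},[],[8,9],{"answer":85,"createTime":5,"id":86,"options":87,"question":92,"source":24,"type":25},[],1060770085,[88,89,90,91],"Breadth-first search expands the earliest-generated nodes first","Depth-first search expands the earliest-generated nodes first","Depth-first search expands the most recently generated nodes first","Breadth-first search expands the most recently generated nodes first","What is the difference between breadth-first search and depth-first search? ( )"]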