Bayesian Networks in R: Study Notes 1
2021/7/19 23:08:24
Written July 19, 2021, 19:54:12
I. Creating Bayesian network structures
1. Creating an empty network
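The graph structure of a Bayesian network is stored in an object of class bn. A bn object can be created from any of three representations: the arc set of the graph, its adjacency matrix, or a model formula. In addition, the functions empty.graph() and random.graph() create empty and random network structures.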
    # create an empty network
    library(bnlearn)
    e = empty.graph(LETTERS[1:6])
    e
    class(e)

    # output

      Random/Generated Bayesian network

      model:
       [A][B][C][D][E][F]
      nodes:                                 6
      arcs:                                  0
        undirected arcs:                     0
        directed arcs:                       0
      average markov blanket size:           0.00
      average neighbourhood size:            0.00
      average branching factor:              0.00

      generation algorithm:                  Empty

    > class(e)
    [1] "bn"
By default empty.graph() returns a single bn object. Setting the num argument makes it return a list of that many network objects instead.
    empty.graph(LETTERS[1:6], num = 2)  # num = 2 returns a list of two bn objects
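As a small sketch of what num returns, the result can be handled like any other R list. The variable name nets is only an illustration and does not come from the original notes.

```r
library(bnlearn)

# num = 2 returns a plain R list holding two empty bn objects
nets = empty.graph(LETTERS[1:6], num = 2)

length(nets)                                   # number of networks in the list
sapply(nets, function(g) inherits(g, "bn"))    # is each element a bn object?
nodes(nets[[1]])                               # node labels of the first network
```

Each element can then be modified on its own, for example with arcs(nets[[1]]) = ... as described in the next section.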
2. Creating a network with a specific structure
① From a specific arc set
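To specify an arc set, first build a matrix that lists the arcs. The code below creates a character matrix with two columns, filled by row, whose column names are from and to.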
    arc.set = matrix(c("A", "C", "B", "F", "C", "F"),
                     ncol = 2, byrow = TRUE,
                     dimnames = list(NULL, c("from", "to")))
    arc.set
         from to
    [1,] "A"  "C"
    [2,] "B"  "F"
    [3,] "C"  "F"
Assign the arc set arc.set to the bn object with the arcs() function.
    arcs(e) = arc.set
    e

      Random/Generated Bayesian network

      model:
       [A][B][D][E][C|A][F|B:C]
      nodes:                                 6
      arcs:                                  3
        undirected arcs:                     0
        directed arcs:                       3
      average markov blanket size:           1.33
      average neighbourhood size:            1.00
      average branching factor:              0.50

      generation algorithm:                  Empty
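Once the arcs are assigned, the structure can be read back from the bn object. The following is a minimal sketch using the accessor functions arcs(), parents() and children() from bnlearn; it rebuilds e from scratch so that it runs on its own.

```r
library(bnlearn)

e = empty.graph(LETTERS[1:6])
arc.set = matrix(c("A", "C", "B", "F", "C", "F"),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("from", "to")))
arcs(e) = arc.set

arcs(e)            # the arc set, returned as a two-column character matrix
parents(e, "F")    # nodes with an arc pointing into F
children(e, "A")   # nodes that A points to
```

This matches the model string printed above: [F|B:C] means that B and C are the parents of F, and [C|A] means that A is the parent of C.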
The arcs() function runs several checks on the arc set to verify that it satisfies the following conditions.
- The node labels used in the arcs must match the labels of the nodes in the graph.
    bogus = matrix(c("X", "Y", "W", "Z"),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, c("from", "to")))
    bogus
         from to
    [1,] "X"  "Y"
    [2,] "W"  "Z"
Assigning bogus to the bn object e raises an error.
    arcs(e) = bogus
    Error in check.arcs(value, nodes = names(x$nodes)) :
      node(s) 'X' 'W' 'Y' 'Z' not present in the graph.
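- The arcs must not introduce cycles (e.g. A → B → C → A), unless ignore.cycles = TRUE is set. In the session below, the installed bnlearn version rejects ignore.cycles as an unused argument; as far as I can tell, more recent bnlearn releases expose this switch as check.cycles = FALSE instead.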
    cycle = matrix(c("A", "C", "C", "B", "B", "A"),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, c("from", "to")))
    cycle
         from to
    [1,] "A"  "C"
    [2,] "C"  "B"
    [3,] "B"  "A"
    arcs(e) = cycle
    Error in `arcs<-`(`*tmp*`, value = c("A", "C", "B", "C", "B", "A")) :
      the specified network contains cycles.
    arcs(e, ignore.cycles = TRUE) = cycle
    Error in `arcs<-`(`*tmp*`, ignore.cycles = TRUE, value = c("A", "C", "B",  :
      unused argument (ignore.cycles = TRUE)
    > acyclic(e)  # check whether the graph is acyclic; e is unchanged because both assignments failed
    [1] TRUE
- The arcs must not introduce loops (e.g. A → A).
    > loops = matrix(c("A", "A", "B", "B", "C", "D"),
    +                ncol = 2, byrow = TRUE,
    +                dimnames = list(NULL, c("from", "to")))
    > loops
         from to
    [1,] "A"  "A"
    [2,] "B"  "B"
    [3,] "C"  "D"
    > arcs(e) = loops
    Error in check.arcs(value, nodes = names(x$nodes)) :
      invalid arcs that are actually loops:
       A -> A
       B -> B
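- Undirected arcs can be introduced by including both directions in the arc set (e.g. A → B and B → A), provided that neither direction introduces a cycle. Setting ignore.cycles overrides this check.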
    > edges = matrix(c("A", "B", "B", "A", "C", "D"),
    +                ncol = 2, byrow = TRUE,
    +                dimnames = list(NULL, c("from", "to")))
    > edges
         from to
    [1,] "A"  "B"
    [2,] "B"  "A"
    [3,] "C"  "D"
    > arcs(e) = edges
    > e

      Random/Generated Bayesian network

      model:
        [partially directed graph]
      nodes:                                 6
      arcs:                                  2
        undirected arcs:                     1  # the undirected arc A - B
        directed arcs:                       1
      average markov blanket size:           0.67
      average neighbourhood size:            0.67
      average branching factor:              0.17

      generation algorithm:                  Empty
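The four checks above can be exercised in one place. The sketch below wraps the assignment in tryCatch() so that a rejected arc set yields a message instead of stopping the script; try_arcs is a hypothetical helper written for this note, not a bnlearn function.

```r
library(bnlearn)

# hypothetical helper: try to assign an arc set and report the outcome as text
try_arcs = function(net, from.to) {
  arc.matrix = matrix(from.to, ncol = 2, byrow = TRUE,
                      dimnames = list(NULL, c("from", "to")))
  tryCatch({
    arcs(net) = arc.matrix   # modifies only the local copy of net
    "accepted"
  }, error = function(err) paste("rejected:", conditionMessage(err)))
}

e = empty.graph(LETTERS[1:6])
results = c(
  valid      = try_arcs(e, c("A", "C", "B", "F", "C", "F")),
  bogus      = try_arcs(e, c("X", "Y", "W", "Z")),            # unknown node labels
  cycle      = try_arcs(e, c("A", "C", "C", "B", "B", "A")),  # A -> C -> B -> A
  loops      = try_arcs(e, c("A", "A", "B", "B", "C", "D")),  # A -> A, B -> B
  undirected = try_arcs(e, c("A", "B", "B", "A", "C", "D"))   # A - B plus C -> D
)
results
```

Because R passes net by value, the helper never changes e itself, so all five cases start from the same empty graph.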
② From a specific adjacency matrix
The amat() function sets the graph structure from an adjacency matrix.
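First create a matrix whose elements are the integers 0 and 1. The code uses 0L and 1L; numeric 0 and 1 also work but take more memory, which matters for large networks. A 1L in row i, column j means there is an arc from node i to node j.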
    > adj = matrix(0L, ncol = 6, nrow = 6,
    +              dimnames = list(LETTERS[1:6], LETTERS[1:6]))
    > adj
      A B C D E F
    A 0 0 0 0 0 0
    B 0 0 0 0 0 0
    C 0 0 0 0 0 0
    D 0 0 0 0 0 0
    E 0 0 0 0 0 0
    F 0 0 0 0 0 0
    > adj["A", "C"] = 1L
    > adj["B", "F"] = 1L
    > adj["C", "F"] = 1L
    > adj["D", "E"] = 1L
    > adj["A", "E"] = 1L
    > adj
      A B C D E F
    A 0 0 1 0 1 0
    B 0 0 0 0 0 1
    C 0 0 0 0 0 1
    D 0 0 0 0 1 0
    E 0 0 0 0 0 0
    F 0 0 0 0 0 0
Then use the amat() function to introduce the arcs into the graph.
    > amat(e) = adj
    > e

      Random/Generated Bayesian network

      model:
       [A][B][D][C|A][E|A:D][F|B:C]
      nodes:                                 6
      arcs:                                  5
        undirected arcs:                     0
        directed arcs:                       5
      average markov blanket size:           2.33
      average neighbourhood size:            1.67
      average branching factor:              0.83

      generation algorithm:                  Empty
Note: like arcs(), the amat() function checks the graph structure for errors.
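The arc set and the adjacency matrix are two views of the same graph. As a rough sketch, the adjacency matrix can be built directly from an arc set by indexing with the two-column character matrix, and both routes should then describe the same structure. The names node.names, e1 and e2 are illustrative only.

```r
library(bnlearn)

node.names = LETTERS[1:6]
arc.set = matrix(c("A", "C", "B", "F", "C", "F"),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("from", "to")))

# rows are tail nodes (from), columns are head nodes (to)
adj = matrix(0L, nrow = 6, ncol = 6,
             dimnames = list(node.names, node.names))
adj[arc.set] = 1L   # a two-column character matrix indexes by (row name, column name)

e1 = empty.graph(node.names)
arcs(e1) = arc.set
e2 = empty.graph(node.names)
amat(e2) = adj

all(amat(e1) == adj)                  # amat() read back from the arc-set version
modelstring(e1) == modelstring(e2)    # both routes give the same model string
```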
③ From a specific model formula
The model formula format comes from the deal package and is defined as follows:
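- Each node is enclosed in square brackets.
- The parents of a node are listed after a pipe (the vertical bar |) and separated by colons (:), all inside the same square brackets as the node.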
The graph structure can be defined in two ways.
Use the model2network() function, which does not require creating an empty graph first:
    > model2network("[A][C][B|A][D|C][F|A:B:C][E|F]")

      Random/Generated Bayesian network

      model:
       [A][C][B|A][D|C][F|A:B:C][E|F]
      nodes:                                 6
      arcs:                                  6
        undirected arcs:                     0
        directed arcs:                       6
      average markov blanket size:           2.67
      average neighbourhood size:            2.00
      average branching factor:              1.00

      generation algorithm:                  Empty
Or use the modelstring() function with an existing bn object, in the same way as arcs() and amat() above:
    > modelstring(e) = "[A][C][B|A][D|C][F|A:B:C][E|F]"
    > e

      Random/Generated Bayesian network

      model:
       [A][C][B|A][D|C][F|A:B:C][E|F]
      nodes:                                 6
      arcs:                                  6
        undirected arcs:                     0
        directed arcs:                       6
      average markov blanket size:           2.67
      average neighbourhood size:            2.00
      average branching factor:              1.00

      generation algorithm:                  Empty
Note: like amat() and arcs(), the model2network() and modelstring() functions check the graph structure for errors.
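To make the formula format concrete, the sketch below assembles a model string from a named list of parents and passes it to model2network(). The helper make_modelstring and the list parents.list are hypothetical names introduced for this illustration.

```r
library(bnlearn)

# each entry maps a node to its parents; character(0) means no parents
parents.list = list(A = character(0), C = character(0), B = "A",
                    D = "C", F = c("A", "B", "C"), E = "F")

# hypothetical helper: "[node]" for a root, "[node|p1:p2]" otherwise
make_modelstring = function(pl) {
  terms = mapply(function(node, pa) {
    if (length(pa) == 0)
      paste0("[", node, "]")
    else
      paste0("[", node, "|", paste(pa, collapse = ":"), "]")
  }, names(pl), pl)
  paste(terms, collapse = "")
}

ms = make_modelstring(parents.list)
ms
dag = model2network(ms)
parents(dag, "F")
```

The string produced here is the same one used in the two examples above, so dag should have the same six arcs.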
3. Creating one or more random structures
① With a specific node ordering
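By default, random.graph() generates a graph that is consistent with the node ordering supplied: in every arc, the tail node precedes the head node in that ordering. Arcs are sampled independently, each being included with the probability given by the prob argument. The default value of prob is set so that, on average, the graph has as many arcs as nodes.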
    > random.graph(LETTERS[1:6], prob = 0.1)  # the node ordering is A to F

      Random/Generated Bayesian network

      model:
       [A][B][D][E][C|B][F|C]  # note that [B] comes before [C|B]
      nodes:                                 6
      arcs:                                  2
        undirected arcs:                     0
        directed arcs:                       2
      average markov blanket size:           0.67
      average neighbourhood size:            0.67
      average branching factor:              0.33

      generation algorithm:                  Full Ordering
      arc sampling probability:              0.1
As with empty.graph(), the num argument generates a batch of graphs. The result is a list, which is convenient to process with lapply().
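A minimal sketch of batch processing follows: it generates ten graphs, counts the arcs in each, and checks that every arc respects the node ordering. The value prob = 0.3 and the helper respects_ordering are assumptions chosen for illustration.

```r
library(bnlearn)

node.names = LETTERS[1:6]
graphs = random.graph(node.names, num = 10, prob = 0.3)

# number of arcs in each generated graph (varies from run to run)
sapply(graphs, function(g) nrow(arcs(g)))

# hypothetical check: the tail of every arc comes before its head in node.names
respects_ordering = function(g) {
  a = arcs(g)   # column 1 is the tail (from), column 2 is the head (to)
  all(match(a[, 1], node.names) < match(a[, 2], node.names))
}
all(sapply(graphs, respects_ordering))
```

sapply() is used here instead of lapply() only so that the results come back as a vector rather than a list.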
② Sampling with uniform probability from the space of connected directed acyclic graphs
Besides the default method = "ordered", method = "ic-dag" samples with uniform probability from the space of connected directed acyclic graphs, using the Ide & Cozman MCMC sampler.
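Note that the suggested length of the burn-in scales quadratically with the number of nodes, so generating many graphs in one call with num is much more efficient than generating them one at a time.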
    > random.graph(LETTERS[1:6], num = 2, method = "ic-dag")
    [[1]]

      Random/Generated Bayesian network

      model:
       [C][E][A|E][B|A:E][D|B:E][F|A:C:D]
      nodes:                                 6
      arcs:                                  8
        undirected arcs:                     0
        directed arcs:                       8
      average markov blanket size:           3.67
      average neighbourhood size:            2.67
      average branching factor:              1.33

      generation algorithm:                  Ide & Cozman's Multiconnected DAGs
      burn in length:                        216
      maximum in-degree:                     Inf
      maximum out-degree:                    Inf
      maximum degree:                        Inf

    [[2]]

      Random/Generated Bayesian network

      model:
       [E][A|E][B|A:E][C|A][D|B:E][F|A:C:D]
      nodes:                                 6
      arcs:                                  9
        undirected arcs:                     0
        directed arcs:                       9
      average markov blanket size:           3.67
      average neighbourhood size:            3.00
      average branching factor:              1.50

      generation algorithm:                  Ide & Cozman's Multiconnected DAGs
      burn in length:                        216
      maximum in-degree:                     Inf
      maximum out-degree:                    Inf
      maximum degree:                        Inf
This method accepts several optional constraints on the structure of the generated graphs, specified with the following arguments:
- burn.in: the number of iterations for the algorithm to converge to a stationary (and uniform) probability distribution.
- every: return only one graph every number of steps instead of all the generated graphs. High values of every result in a more diverse set.
- max.degree: the maximum degree of a node.
- max.in.degree: the maximum in-degree.
- max.out.degree: the maximum out-degree.
Example:
    > random.graph(LETTERS[1:6], num = 2, method = "ic-dag", burn.in = 10^4,
    +              every = 50, max.degree = 3)
    [[1]]

      Random/Generated Bayesian network

      model:
       [B][F|B][A|F][C|A:F][D|A:B][E|B:C]
      nodes:                                 6
      arcs:                                  8
        undirected arcs:                     0
        directed arcs:                       8
      average markov blanket size:           3.33
      average neighbourhood size:            2.67
      average branching factor:              1.33

      generation algorithm:                  Ide & Cozman's Multiconnected DAGs
      burn in length:                        10000
      maximum in-degree:                     Inf
      maximum out-degree:                    Inf
      maximum degree:                        3

    [[2]]

      Random/Generated Bayesian network

      model:
       [A][C|A][F|C][D|F][B|D:F][E|B:C]
      nodes:                                 6
      arcs:                                  7
        undirected arcs:                     0
        directed arcs:                       7
      average markov blanket size:           2.67
      average neighbourhood size:            2.33
      average branching factor:              1.17

      generation algorithm:                  Ide & Cozman's Multiconnected DAGs
      burn in length:                        10000
      maximum in-degree:                     Inf
      maximum out-degree:                    Inf
      maximum degree:                        3
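One way to confirm that a constraint such as max.degree was honoured is to recompute the node degrees from the generated graphs. The sketch below counts the neighbours of each node with nbr(); the helper max_node_degree is a hypothetical name, and the default burn-in is used to keep the run short.

```r
library(bnlearn)

node.names = LETTERS[1:6]
graphs = random.graph(node.names, num = 5, method = "ic-dag", max.degree = 3)

# degree of a node = number of neighbours (parents plus children)
max_node_degree = function(g) {
  max(sapply(node.names, function(n) length(nbr(g, n))))
}

sapply(graphs, max_node_degree)   # each value should be at most 3
sapply(graphs, acyclic)           # each graph should be a DAG
```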
③ Sampling with uniform probability from the space of directed acyclic graphs
Melançon's MCMC algorithm samples with uniform probability from the space of directed acyclic graphs (not necessarily connected). Select it with method = "melancon".
    > random.graph(LETTERS[1:6], method = "melancon")

      Random/Generated Bayesian network

      model:
       [A][B|A][F|A][C|A:B][E|B][D|B:C:E]
      nodes:                                 6
      arcs:                                  8
        undirected arcs:                     0
        directed arcs:                       8
      average markov blanket size:           3.00
      average neighbourhood size:            2.67
      average branching factor:              1.33

      generation algorithm:                  Melancon's Uniform Probability DAGs
      burn in length:                        216
      maximum in-degree:                     Inf
      maximum out-degree:                    Inf
      maximum degree:                        Inf
It accepts the same optional arguments as the Ide & Cozman algorithm.
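As a quick sanity check, a graph sampled this way should still be a fully directed acyclic graph. The snippet below is a small sketch using acyclic() and directed() from bnlearn; the number of arcs differs from run to run, so no particular value is expected.

```r
library(bnlearn)

g = random.graph(LETTERS[1:6], method = "melancon")

acyclic(g)      # TRUE when the graph contains no cycles
directed(g)     # TRUE when the graph contains no undirected arcs
nrow(arcs(g))   # number of arcs in this particular sample
```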
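4. Summary
- Three kinds of network can be created: empty networks, networks with a specific structure, and random networks.
- A network with a specific structure can be created in three ways: from an arc set, an adjacency matrix, or a model formula.
- Open question: I do not fully understand random networks yet. Why create them at all? Is it to show what kinds of structure a given set of nodes can produce?
- What does sampling a random network actually mean?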