lomas.generator submodule
- class lomas.generator.GeneratorCommon(ip_id_dict, cdf_size)
- __init__(ip_id_dict, cdf_size)
- generate(time_limit, time_unit)
- initialize(trace_input)
- poisson(lam)
- class lomas.generator.GeneratorLomas(ip_id_dict, ordered_ippair, cdf_iat, cdf_size)
- __init__(ip_id_dict, ordered_ippair, cdf_iat, cdf_size)
Trains a model on historical traffic data and generates new synthetic traffic data from the trained model.
- Parameters:
ip_id_dict (dict) -- key=index of IP, value=(anonymized) IP address
ordered_ippair (list) -- ordered IP pairs (each IP is represented by its index)
cdf_iat (dict) -- key=percentile, value=value of the interarrival time CDF at that percentile
cdf_size (dict) -- key=percentile, value=value of the flow size CDF at that percentile
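As a rough illustration of the shapes these four arguments might take, the sketch below builds percentile-keyed CDF dictionaries from raw samples. The helper `percentile_cdf`, the choice of integer percentiles from 0 to 100, the nearest-rank method, and the tuple form of each IP pair are all assumptions made for this example; they are not taken from the lomas source.

```python
import math


def percentile_cdf(samples, percentiles=(0, 25, 50, 75, 100)):
    """Return {percentile: value} using the nearest-rank method (hypothetical helper)."""
    ordered = sorted(samples)
    n = len(ordered)
    cdf = {}
    for p in percentiles:
        # Nearest rank: the smallest value with at least p% of the samples at or below it.
        rank = max(1, math.ceil(p / 100 * n))
        cdf[p] = ordered[rank - 1]
    return cdf


# Hypothetical inputs shaped like the documented parameters.
ip_id_dict = {0: "10.0.0.1", 1: "10.0.0.2", 2: "10.0.0.3"}  # index -> IP address
ordered_ippair = [(0, 1), (0, 2), (1, 2)]                   # pairs of IP indices (assumed tuples)
cdf_iat = percentile_cdf([0.2, 0.5, 0.1, 1.5, 0.9, 0.4, 3.0, 0.7])
cdf_size = percentile_cdf([1200, 64, 1500, 40, 800, 9000, 300, 150])
```

The real class may expect a finer percentile grid or a different pair representation, so check the values produced by the package's own preprocessing before relying on this layout.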
- generate(time_limit, time_unit)
Generates new synthetic traffic data.
- initialize(trace_input)
Collects the flow data for each source-destination pair and stores it as a 2D array.
- Parameters:
trace_input (pandas.DataFrame) -- the input trace; can be accessed using lomas.Preprocessor.trace_input
- Returns:
self.arr_flow_type, self.dictionary, self.corpus
- Return type:
2D list, 1D list, 2D list
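The per-pair grouping that initialize performs can be pictured with the minimal sketch below. It uses plain tuples rather than a pandas.DataFrame, and the record layout `(src, dst, timestamp, size)` and the function name `group_flows_by_pair` are assumptions for illustration only; the actual method may store different fields.

```python
from collections import defaultdict


def group_flows_by_pair(records, ordered_ippair):
    """Return a 2D list: one row of (timestamp, size) flows per pair, in ordered_ippair order."""
    by_pair = defaultdict(list)
    for src, dst, timestamp, size in records:
        by_pair[(src, dst)].append((timestamp, size))
    # One row per pair, sorted by timestamp; pairs with no flows get an empty row.
    return [sorted(by_pair.get(pair, [])) for pair in ordered_ippair]


# Hypothetical trace records: (src index, dst index, timestamp, flow size).
records = [
    (0, 1, 1.0, 300),
    (0, 2, 0.5, 200),
    (0, 1, 0.0, 100),
    (1, 2, 0.2, 50),
]
rows = group_flows_by_pair(records, [(0, 1), (0, 2), (1, 2)])
```

Keeping the rows in the same order as ordered_ippair means a row index can double as the index of the source-destination pair.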
- sampling_helper(cdf, tag)
Helper function that maps discretized flow size and interarrival time labels back to real values.
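One way such a mapping could work is sketched below, under the assumption that a tag is the index of the bin between two consecutive percentiles of the CDF and that a value is drawn uniformly inside that bin. Both the binning scheme and the name `label_to_value` are hypothetical; the library's own helper may discretize differently.

```python
import random


def label_to_value(cdf, tag, rng=random):
    """Map bin index `tag` to a real value drawn uniformly within that bin (assumed scheme)."""
    percentiles = sorted(cdf)
    low = cdf[percentiles[tag]]        # value at the bin's lower percentile
    high = cdf[percentiles[tag + 1]]   # value at the bin's upper percentile
    return rng.uniform(low, high)


# Hypothetical flow size CDF keyed by percentile.
cdf_size = {0: 40, 25: 64, 50: 300, 75: 1200, 100: 9000}
value = label_to_value(cdf_size, 2, random.Random(0))  # bin 2 spans the 50th to 75th percentile
```

Drawing uniformly inside the bin is the simplest choice; interpolating on a log scale may suit heavy-tailed flow sizes better.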
- sampling_value(doc_idx)
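Samples from the latent-space probability distribution matrices, producing a joint value of flow size and interarrival time according to those distributions.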
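The two-stage draw can be sketched as follows: pick a topic from the document's row of the document-topic matrix, then pick a term from that topic's row of the topic-word matrix. The vocabulary of `(size_tag, iat_tag)` tuples and the name `sample_joint_label` are assumptions for this example, not the library's actual data layout.

```python
import random


def sample_joint_label(doc_topics_row, topic_terms, vocabulary, rng=random):
    """Draw a topic, then a term, and return the joint label the term stands for."""
    # Stage 1: choose a topic according to the document's topic distribution.
    topic = rng.choices(range(len(doc_topics_row)), weights=doc_topics_row, k=1)[0]
    # Stage 2: choose a term according to the chosen topic's term distribution.
    term = rng.choices(range(len(vocabulary)), weights=topic_terms[topic], k=1)[0]
    return vocabulary[term]


# Hypothetical matrices: 2 topics, 4 terms; each term encodes (size_tag, iat_tag).
doc_topics_row = [0.7, 0.3]
topic_terms = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
vocabulary = [(0, 0), (0, 1), (1, 0), (1, 1)]
label = sample_joint_label(doc_topics_row, topic_terms, vocabulary, random.Random(1))
```

Because each term carries both tags, the flow size and the interarrival time are sampled jointly rather than independently.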
- train(num_topics=25, chunksize=2000, passes=20, iterations=400, eval_every=None)
Trains the model.
- Parameters:
- Returns:
self.doc_topics (document-topic distribution), self.topic_terms (topic-word distribution)
- Return type:
2D np.array, 2D np.array