ML.NET Used Car Price Prediction: Retraining and Parameter Tuning (Part 2)
2022/3/7 23:18:51
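Retraining and Parameter Tuning
The file UsedCarsPricePredictionMLModel.training.cs contains the methods that configure and run training.
The BuildPipeline method holds the training configuration generated by ML.NET: which columns are used as features, which field is predicted, and the call to the LightGbm method, configured with the following parameters: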
{
    NumberOfLeaves = 17,
    MinimumExampleCountPerLeaf = 25,
    NumberOfIterations = 6019,
    MaximumBinCountPerFeature = 24,
    LearningRate = 1F,
    LabelColumnName = @"Price",
    FeatureColumnName = @"Features",
    Booster = new GradientBooster.Options()
    {
        SubsampleFraction = 0.706948120047722F,
        FeatureFraction = 0.521537449021549F,
        L1Regularization = 0.00247814105551342F,
        L2Regularization = 0.00137211480690565F
    }
}
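These are the recommended parameters that ML.NET generated automatically. If you have some background in machine learning, you can modify them from this starting point to optimize the model.
Reference: LightGbmExtensions.LightGbm method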
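As a minimal sketch of what "modifying the parameters" could look like, the helper below rebuilds the generated options object and exposes two of the values as arguments. The class name TunedTrainer, the method name CreateOptions, and the default values 0.1 and 1000 are assumptions chosen for illustration only; they are not part of the generated code and are not recommended settings.

```csharp
using Microsoft.ML.Trainers.LightGbm;

public static class TunedTrainer
{
    // Hypothetical helper (not part of the generated code): returns the
    // generated options with the learning rate and iteration count exposed
    // as arguments. The default values are arbitrary illustration values.
    public static LightGbmRegressionTrainer.Options CreateOptions(
        double learningRate = 0.1, int numberOfIterations = 1000)
    {
        return new LightGbmRegressionTrainer.Options
        {
            NumberOfLeaves = 17,
            MinimumExampleCountPerLeaf = 25,
            NumberOfIterations = numberOfIterations,
            MaximumBinCountPerFeature = 24,
            LearningRate = learningRate,
            LabelColumnName = "Price",
            FeatureColumnName = "Features",
            // Booster values are copied unchanged from the generated configuration.
            Booster = new GradientBooster.Options
            {
                SubsampleFraction = 0.706948120047722,
                FeatureFraction = 0.521537449021549,
                L1Regularization = 0.00247814105551342,
                L2Regularization = 0.00137211480690565
            }
        };
    }
}
```

Inside BuildPipeline, the trainer call would then read mlContext.Regression.Trainers.LightGbm(TunedTrainer.CreateOptions(learningRate: 0.05)). Whether a given change helps should be checked by retraining and comparing metrics on held-out data.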
The complete training code is as follows:
public static IEstimator<ITransformer> BuildPipeline(MLContext mlContext)
{
    // Data process configuration with pipeline data transformations
    var pipeline = mlContext.Transforms.Categorical.OneHotEncoding(new []
        {
            new InputOutputColumnPair(@"Fuel_Type", @"Fuel_Type"),
            new InputOutputColumnPair(@"Transmission", @"Transmission"),
            new InputOutputColumnPair(@"Owner_Type", @"Owner_Type")
        })
        .Append(mlContext.Transforms.ReplaceMissingValues(new []
        {
            new InputOutputColumnPair(@"Year", @"Year"),
            new InputOutputColumnPair(@"Kilometers_Driven", @"Kilometers_Driven"),
            new InputOutputColumnPair(@"Seats", @"Seats")
        }))
        .Append(mlContext.Transforms.Text.FeaturizeText(@"Name", @"Name"))
        .Append(mlContext.Transforms.Text.FeaturizeText(@"Location", @"Location"))
        .Append(mlContext.Transforms.Text.FeaturizeText(@"Engine", @"Engine"))
        .Append(mlContext.Transforms.Text.FeaturizeText(@"Power", @"Power"))
        .Append(mlContext.Transforms.Concatenate(@"Features", new []
        {
            @"Fuel_Type", @"Transmission", @"Owner_Type", @"Year", @"Kilometers_Driven",
            @"Seats", @"Name", @"Location", @"Engine", @"Power"
        }))
        .Append(mlContext.Regression.Trainers.LightGbm(new LightGbmRegressionTrainer.Options()
        {
            NumberOfLeaves = 17,
            MinimumExampleCountPerLeaf = 25,
            NumberOfIterations = 6019,
            MaximumBinCountPerFeature = 24,
            LearningRate = 1F,
            LabelColumnName = @"Price",
            FeatureColumnName = @"Features",
            Booster = new GradientBooster.Options()
            {
                SubsampleFraction = 0.706948120047722F,
                FeatureFraction = 0.521537449021549F,
                L1Regularization = 0.00247814105551342F,
                L2Regularization = 0.00137211480690565F
            }
        }));

    return pipeline;
}
After that, call the RetrainPipeline method to train again and obtain a new model:
public static ITransformer RetrainPipeline(MLContext context, IDataView trainData)
{
    var pipeline = BuildPipeline(context);
    var model = pipeline.Fit(trainData);

    return model;
}
Once you have the model, save it to a file:
// Note: use a .txt or .tsv file here
string trainCsvPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "TrainData", "train-data.txt");
string testCsvPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "TrainData", "test-data2.txt");
string modelDirectory = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Model");
string modelPath = Path.Combine(modelDirectory, "UsedCarsPricePredictionMLModel.zip");

MLContext mlContext = new MLContext(seed: 0);
IDataView trainingDataView = mlContext.Data.LoadFromTextFile<ModelInput>(trainCsvPath, hasHeader: true);

// Retrain with the generated pipeline
var model = UsedCarsPricePredictionMLModel.RetrainPipeline(mlContext, trainingDataView);

// Create the output directory if needed, then save the model together with the input schema
if (!Directory.Exists(modelDirectory))
    Directory.CreateDirectory(modelDirectory);
mlContext.Model.Save(model, trainingDataView.Schema, modelPath);
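The snippet above defines testCsvPath but does not use it. One possible use, sketched here under assumptions, is to score the test file with the retrained model and print regression metrics, so that the effect of a parameter change can be compared between runs. The class name ModelEvaluation and the method name Evaluate are hypothetical; the sketch also assumes the test file contains a Price column and that the trainer writes its prediction to the default Score column.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class ModelEvaluation
{
    // Hypothetical helper: scores the test set with the trained model
    // and prints a few standard regression metrics.
    public static RegressionMetrics Evaluate(
        MLContext mlContext, ITransformer model, IDataView testData)
    {
        // Apply the full pipeline (transforms + trainer) to the test rows.
        IDataView predictions = model.Transform(testData);

        // Compare the predicted "Score" column against the "Price" label.
        RegressionMetrics metrics = mlContext.Regression.Evaluate(
            predictions, labelColumnName: "Price", scoreColumnName: "Score");

        Console.WriteLine($"R^2:  {metrics.RSquared:F4}");
        Console.WriteLine($"RMSE: {metrics.RootMeanSquaredError:F4}");
        Console.WriteLine($"MAE:  {metrics.MeanAbsoluteError:F4}");
        return metrics;
    }
}
```

With the variables from the snippet above, usage could look like: load the test file with mlContext.Data.LoadFromTextFile<ModelInput>(testCsvPath, hasHeader: true), then call ModelEvaluation.Evaluate(mlContext, model, testDataView).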
Minor Issues
Issue 1:
Property 'Column1' is missing the LoadColumnAttribute attribute
As the message says, every property of the ModelInput input class needs a LoadColumn attribute that specifies which column it is read from.
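A sketch of what the annotated class might look like is shown below. The property names follow the columns used in BuildPipeline plus the Column1 property named in the error message. The column indices and the numeric property types are assumptions made for illustration; they must be adjusted to match the real column order and types in your training file.

```csharp
using Microsoft.ML.Data;

// Sketch of the input class. The LoadColumn indices below are ASSUMED;
// replace them with the actual zero-based column positions in your file.
public class ModelInput
{
    [LoadColumn(0), ColumnName("Column1")]
    public float Column1 { get; set; }

    [LoadColumn(1), ColumnName("Name")]
    public string Name { get; set; }

    [LoadColumn(2), ColumnName("Location")]
    public string Location { get; set; }

    [LoadColumn(3), ColumnName("Year")]
    public float Year { get; set; }

    [LoadColumn(4), ColumnName("Kilometers_Driven")]
    public float Kilometers_Driven { get; set; }

    [LoadColumn(5), ColumnName("Fuel_Type")]
    public string Fuel_Type { get; set; }

    [LoadColumn(6), ColumnName("Transmission")]
    public string Transmission { get; set; }

    [LoadColumn(7), ColumnName("Owner_Type")]
    public string Owner_Type { get; set; }

    [LoadColumn(9), ColumnName("Engine")]
    public string Engine { get; set; }

    [LoadColumn(10), ColumnName("Power")]
    public string Power { get; set; }

    [LoadColumn(11), ColumnName("Seats")]
    public float Seats { get; set; }

    [LoadColumn(13), ColumnName("Price")]
    public float Price { get; set; }
}
```

The indices do not have to be contiguous: columns in the file that no property points to are simply not loaded.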
Issue 2:
Schema mismatch for input column 'Name_CharExtractor': expected Expected known-size vector of Single, got Vector<Single> Arg_ParamName_Name
According to the discussion "ML.NET: Schema mismatch for input column 'AnswerFeaturized_CharExtractor': expected Expected Single or known-size vector of Single, got Vector", the fix is to stop using a .csv file and switch to a .txt or .tsv file.
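For context, LoadFromTextFile uses a tab as its default separator, so the call in the saving snippet (which passes only hasHeader) reads its input as tab-separated. The source does not state the root cause of the error, so the following is only a sketch that makes the separator explicit when loading a .txt or .tsv file. The class name DataLoading and the method name LoadTsv are hypothetical.

```csharp
using Microsoft.ML;

public static class DataLoading
{
    // Hypothetical helper: loads a tab-separated file that has a header row.
    // Passing separatorChar explicitly documents the expected file format.
    public static IDataView LoadTsv<TRow>(MLContext mlContext, string path)
        where TRow : class
    {
        return mlContext.Data.LoadFromTextFile<TRow>(
            path, separatorChar: '\t', hasHeader: true);
    }
}
```

A call such as DataLoading.LoadTsv<ModelInput>(mlContext, trainCsvPath) would then take the place of the LoadFromTextFile call in the saving snippet.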
Sample Code
UsedCarsPricePrediction
References
ML.NET 10-minute quick start
Official samples: machinelearning-samples
Tutorial: Predict prices using regression with ML.NET