^This page ((needs review))^

((Trackers)) make it possible to set up data sources for training Machine Learning models in Tiki without requiring any special tools.

! Preparing Machine Learning Dataset
Preparing a dataset in Tiki involves either creating a new tracker and adding items to it, or using an existing tracker and making any necessary modifications to it. This step is important because a Machine Learning model must be trained on existing data before it can make predictions. ((Trackers)) are used for this purpose because the data they contain can be managed entirely within Tiki, which eliminates the need for special tools.

!! Creating Dataset
You create a dataset in Tiki by creating a tracker. Go to the __List Trackers__ page, click the __Create__ button, and fill in the __Create tracker__ form as you would for any other tracker.

{img src="display1888" link="display1888" width="400" rel="box[g]" imalign="center" desc="Use the create button on List Trackers page to create a tracker" align="center" styleimage="border"}
{img src="display1889" link="display1889" width="400" rel="box[g]" imalign="center" desc="Fill in the Create tracker form" align="center" styleimage="border"}

Give the tracker a name and a concise description, then save it. Next, add one field to the tracker for every attribute in your dataset.

{img src="display1890" link="display1890" width="400" rel="box[g]" imalign="center" desc="Add a tracker field by clicking on the Add Field button" align="center" styleimage="border"}
{img src="display1891" link="display1891" width="400" rel="box[g]" imalign="center" desc="Add new field form" align="center" styleimage="border"}

The tracker field type you choose depends on the attribute type. ((Tracker Field Types)) lists the available Tiki tracker field types. Do not use multi-value field types like ((Checkbox Tracker Field)).
Only use single-value field types such as ((Text Tracker Field)). Use ((Numeric Tracker Field)) for all continuous attributes. Note that the field types you choose directly affect the user experience when the model is used. For categorical attributes, consider field types such as Dropdown Tracker Field or Radio Tracker Field to make user input easier.

See ((Creating a Tracker)) for more information on creating trackers. To learn how to add fields to a tracker, see ((Adding fields to a tracker)).

!! Adding Samples to Dataset
In Tiki, each tracker item represents a sample, so you add samples to the dataset by adding items to the tracker. A simple way to do this is to click the __Create Item__ button on the view tracker page and fill in the __Create Item__ form that appears.

{img src="display1892" link="display1892" width="400" rel="box[g]" imalign="center" desc="Click on Create Item button to add tracker item" align="center" styleimage="border"}
{img src="display1893" link="display1893" width="400" rel="box[g]" imalign="center" desc="Create tracker item form" align="center" styleimage="border"}

Samples can also be collected from user input if the dataset tracker is embedded in a wiki, article, or blog page using ((PluginTracker)).

!! Importing Samples
If the dataset already exists in an external source such as a [https://en.wikipedia.org/wiki/Comma-separated_values|Comma-Separated Values (CSV)] file, you can import its samples into a tracker in Tiki using ((Tracker Import Export)). For example, if you are working on the [https://github.com/RubixML/Divorce|Divorce Prediction] project, you can [https://github.com/RubixML/Divorce/blob/master/dataset.csv|download the project's dataset] to your computer and then use Tracker Tabular in Tiki to import the samples into a tracker.
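
Before importing a CSV file, it can help to check which field type each column is likely to need. The following is a minimal sketch in Python, run outside Tiki, that applies the guidance above: columns whose values are all numbers are flagged for a numeric field, columns with only a few distinct values for a dropdown or radio field, and everything else for a text field. The function name ''suggest_field_types'', the ''max_categories'' threshold, and the returned labels are assumptions made for illustration; they are not part of Tiki or of Tracker Tabular.

```python
import csv
import io


def _is_number(value):
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def suggest_field_types(csv_text, max_categories=10):
    """Suggest a tracker field type for each CSV column.

    Heuristic only: the labels are hypothetical hints for a human,
    not identifiers understood by Tiki.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            # Short rows yield None for missing cells; treat them as empty.
            columns[name].append((row.get(name) or "").strip())

    suggestions = {}
    for name, values in columns.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            suggestions[name] = "Text"  # nothing to infer from
        elif all(_is_number(v) for v in non_empty):
            suggestions[name] = "Numeric"
        elif len(set(non_empty)) <= max_categories:
            suggestions[name] = "Dropdown or Radio"
        else:
            suggestions[name] = "Text"
    return suggestions


if __name__ == "__main__":
    # Small made-up sample; replace with the contents of your own file.
    sample = "age,status,comment\n34,married,a\n29,single,b\n41,married,c\n"
    for column, field_type in suggest_field_types(sample, max_categories=2).items():
        print(column, "->", field_type)
```

Treat the result as a starting point only. A column of numeric codes, for example, may really be categorical, in which case a dropdown or radio field is the better choice.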
{img src="display1894" link="display1894" width="400" rel="box[g]" imalign="center" desc="Tracker Tabular list" align="center" styleimage="border"}
{img src="display1895" link="display1895" width="400" rel="box[g]" imalign="center" desc="Use the Import button to import items into tracker" align="center" styleimage="border"}

See ((Tracker Import Export)) to learn how to import tracker items using Tracker Tabular.

!! Related links
* ((Machine Learning))
* ((Creating Machine Learning Models))
* ((Configuring Machine Learning Models))
* ((Training Machine Learning Models))
* ((Using Machine Learning Models))
* ((Trackers))
* ((Tracker Field Types))
* https://github.com/RubixML/Divorce