Merged
updatereadme.md files w/ grammer corrections
JoeOster committed Dec 3, 2020
commit d66acf1d2d4744d630ea3f19cc78afdaa5c20de2
22 changes: 11 additions & 11 deletions AI-and-Analytics/End-to-end-Workloads/Census/README.md
@@ -1,31 +1,31 @@
# End-to-end machine learning workload: Census
-This sample code illustrates how to use Modin for ETL operations and ridge regression algorithm from the DAAL accelerated scikit-learn library to build and run an end to end machine learning workload. It demonstrates how to use software products that can be found in the [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).
+This sample code illustrates how to use Modin for ETL operations, and ridge regression algorithm from the DAAL accelerated scikit-learn library to build and run an end to end machine learning workload. It demonstrates how to use software products that can be found in the [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

| Optimized for | Description
| :--- | :---
| OS | 64-bit Linux: Ubuntu 18.04 or higher
| Hardware | Intel Atom® Processors; Intel® Core™ Processor Family; Intel® Xeon® Processor Family; Intel® Xeon® Scalable Performance Processor Family
| Software | Python version 3.7, Modin, Ray, daal4py, Scikit-Learn, NumPy, Intel® AI Analytics Toolkit
-| What you will learn | How to use Modin and DAAL optimized scikit-learn (developed and owned by Intel) to build end to end ML workloads and gain performance.
+| What you will learn | How to use Modin and DAAL optimized scikit-learn (developed and owned by Intel) to build an end to end ML workloads and gain performance.
| Time to complete | 15-18 minutes

## Purpose
-Modin uses Ray to provide an effortless way to speed up your Pandas notebooks, scripts and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing Pandas code. Daal4py is a simplified API to Intel DAAL that allows for fast usage of the framework suited for Data Scientists and Machine Learning users. It is built to help provide an abstraction to Intel® DAAL for either direct usage or integration into one's own framework.
+Modin uses Ray to provide an effortless way to speed up your Pandas notebooks, scripts and libraries. Unlike other distributed DataFrame libraries, Modin provides seamless integration and compatibility with existing Pandas code. Daal4py is a simplified API to Intel DAAL that allows for fast usage of the framework suited for Data Scientists and Machine Learning users. It is built to help provide an abstraction to Intel® DAAL for direct usage or integration into one's own framework.

#### Model and dataset
-In this sample, you will use Modin to ingest and process U.S. census data from 1970 to 2010 in order to build a ridge regression based model to find the relation between education and the total income earned in the US.
-Data transformation stage normalizes the income to the yearly inflation, balances the data such that each year has a similar number of data points, and extracts the features from the transformed dataset. The feature vectors are fed into the ridge regression model to predict the income of each sample.
+In this sample, you will use Modin to ingest and process U.S. census data from 1970 to 2010 to build a ridge regression based model to find the relation between education and the total income earned in the US.
+The data transformation stage normalizes the income to the yearly inflation, balances the data such that each year has a similar number of data points, and extracts the features from the transformed dataset. The feature vectors are fed into the ridge regression model to predict the income of each sample.
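
The pipeline described above (inflation adjustment, then ridge regression of income on education) can be sketched in plain Python. This is only an illustration under stated assumptions: the records and the price index are made-up values, `adjust_for_inflation` and `fit_ridge_1d` are hypothetical helper names, the balancing step is omitted, and the sample itself uses Modin and DAAL-accelerated scikit-learn rather than this closed-form single-feature fit.

```python
# Synthetic, made-up records: (census year, years of education, nominal income).
records = [
    (1970, 8, 4000.0), (1970, 12, 6000.0), (1970, 16, 8000.0),
    (2010, 8, 20000.0), (2010, 12, 30000.0), (2010, 16, 40000.0),
]
# Hypothetical price index per year (not real inflation figures).
price_index = {1970: 0.2, 2010: 1.0}


def adjust_for_inflation(rows, index):
    """Express every income at the same price level."""
    return [(edu, income / index[year]) for year, edu, income in rows]


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one feature; the intercept is not penalized."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept


samples = adjust_for_inflation(records, price_index)
education = [edu for edu, _ in samples]
income = [inc for _, inc in samples]
slope, intercept = fit_ridge_1d(education, income, alpha=1.0)
print(f"income ~ {slope:.2f} * education_years + {intercept:.2f}")
```

Setting `alpha=0.0` reduces the fit to ordinary least squares; a larger `alpha` shrinks the slope, which is the regularization effect ridge regression adds.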

-Dataset is from IPUMS USA, University of Minnesota , [www.ipums.org](https://ipums.org/) (Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020. https://doi.org/10.18128/D010.V10.0)
+Dataset is from IPUMS USA, University of Minnesota, [www.ipums.org](https://ipums.org/) (Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas and Matthew Sobek. IPUMS USA: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2020. https://doi.org/10.18128/D010.V10.0)

## Key Implementation Details
This end-to-end workload sample code is implemented for CPU using the Python language. The example requires you to have Modin, Ray, daal4py, Scikit-Learn, NumPy installed inside a conda environment, similar to what is directed by the [oneAPI AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/articles/installing-ai-kit-with-conda.html) as well as the steps that follow in this README.

## License

-This code sample is licensed under MIT license
+This code sample is licensed under the MIT license

-## Building Modin and daal4py for CPU to build and run end-to-end workload
+## Building Modin and daal4py for CPU to build and run an end-to-end workload

Modin and oneAPI Data Analytics Library (DAAL) is ready for use once you finish the Intel AI Analytics Toolkit installation with the Conda Package Manager.

@@ -34,7 +34,7 @@ You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi)

### Activate conda environment With Root Access

-Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script and Intel Distribution of Modin environment installation (https://software.intel.com/content/www/us/en/develop/articles/installing-ai-kit-with-conda.html). Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a super user. If you customized the installation folder, the `setvars.sh` file is in your custom folder.
+Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script and Intel Distribution of Modin environment installation (https://software.intel.com/content/www/us/en/develop/articles/installing-ai-kit-with-conda.html). Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a superuser. If you customized the installation folder, the `setvars.sh` file is in your custom folder.

Activate the conda environment with the following command:

@@ -73,7 +73,7 @@ pip install jupyter

### Install wget package

-Install wget package in order to retrieve the Census dataset using HTTPS
+Install wget package to retrieve the Census dataset using HTTPS

```
pip install wget
@@ -98,7 +98,7 @@ Open .ipynb file and run cells in Jupyter Notebook using the "Run" button. Alter

### Run as Python File

-Open notebook in Jupyter and download as python file (see image using "census modin" sample)
+Open notebook in Jupyter and download as python file (see the image using "census modin" sample)

![Download as python file in the Jupyter Notebook](Running_Jupyter_notebook_as_Python.jpg "Download as python file in the Jupyter Notebook")

@@ -1,5 +1,5 @@
# Intel Python daal4py Distributed K-Means
-This sample code shows how to train and predict with a distributed k-means model using the python API package daal4py for oneAPI Data Analytics Library. It assumes you have a working version of MPI library installed and it demonstrates how to use software products that can be found in the [Intel oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html) or [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).
+This sample code shows how to train and predict with a distributed k-means model using the python API package daal4py for oneAPI Data Analytics Library. It assumes you have a working version of the MPI library installed, and it demonstrates how to use software products that can be found in the [Intel oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html) or [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

| Optimized for | Description
| :--- | :---
@@ -11,9 +11,9 @@ This sample code shows how to train and predict with a distributed k-means model

## Purpose

-daal4py is a simplified API to Intel® DAAL that allows for fast usage of the framework suited for Data Scientists or Machine Learning users. Built to help provide an abstraction to Intel® DAAL for either direct usage or integration into one's own framework.
+daal4py is a simplified API to Intel® DAAL that allows for fast usage of the framework suited for Data Scientists or Machine Learning users. Built to help provide an abstraction to Intel® DAAL for direct usage or integration into one's own framework.

-In this sample you will run a distributed K-Means model with oneDAL daal4py library memory objects. You will also learn how to train a model and save the information to a file.
+In this sample, you will run a distributed K-Means model with oneDAL daal4py library memory objects. You will also learn how to train a model and save the information to a file.
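
To give a feel for what "distributed" means here, the following plain-Python sketch simulates the pattern on a single machine: each chunk of data stands in for one MPI rank, computes partial per-cluster sums locally, and the partial results are then combined into new centroids. It does not use daal4py or MPI; the function names (`partial_stats`, `combine`, `distributed_kmeans`) and the toy data are hypothetical and chosen only for illustration.

```python
def partial_stats(chunk, centroids):
    """Local step: per-cluster coordinate sums and point counts for one chunk."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        nearest = min(
            range(k),
            key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def combine(partials, centroids):
    """Reduce step: merge the partial results and recompute the centroids."""
    k, dim = len(centroids), len(centroids[0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    for sums, counts in partials:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
    return [
        [s / total_counts[j] for s in total_sums[j]]
        if total_counts[j] else list(centroids[j])  # keep empty clusters in place
        for j in range(k)
    ]


def distributed_kmeans(chunks, centroids, iterations=10):
    for _ in range(iterations):
        partials = [partial_stats(chunk, centroids) for chunk in chunks]
        centroids = combine(partials, centroids)
    return centroids


# Four chunks, as if the data were split across four processes.
chunks = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
    [(11.0, 10.0), (11.0, 11.0)],
]
centers = distributed_kmeans(chunks, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
```

Only the small `(sums, counts)` summaries cross chunk boundaries, never the raw points, which is why this pattern scales to data that does not fit on one node.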

## Key Implementation Details
This distributed K-means sample code is implemented for CPU using the Python language. The example assumes you have daal4py and scikit-learn installed inside a conda environment, similar to what is delivered with the installation of the Intel(R) Distribution for Python as part of the [oneAPI AI Analytics Toolkit powered by oneAPI](https://software.intel.com/en-us/oneapi/ai-kit).
@@ -22,17 +22,17 @@ This distributed K-means sample code is implemented for CPU using the Python lan
You will need a working MPI library. We recommend to use Intel(R) MPI, which is included in the [oneAPI HPC Toolkit](https://software.intel.com/en-us/oneapi/hpc-kit).

## License
-This code sample is licensed under MIT license
+This code sample is licensed under the MIT license

## Building daal4py for CPU

-oneAPI Data Analytics Library is ready for use once you finish the Intel AI Analytics Toolkit installation, and have run the post installation script.
+oneAPI Data Analytics Library is ready for use once you finish the Intel AI Analytics Toolkit installation and have run the post installation script.

-You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation, and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.
+You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.

### Activate conda environment With Root Access

-Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script. Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a super user. If you customized the installation folder, the `setvars.sh` file is in your custom folder.
+Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script. Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a superuser. If you customized the installation folder, the `setvars.sh` file is in your custom folder.

Intel Python environment will be active by default. However, if you activated another environment, you can return with the following command:

@@ -80,9 +80,9 @@ Run the Program

`mpirun -n 4 python ./IntelPython_daal4py_Distributed_Kmeans.py`

-The output of the script will be saved in the included models and results directories.
+The output of the script will be saved in the included models and result directories.

-_Note: This code samples focuses on how to use daal4py to do distributed ML computations on chunks of data. The `mpirun` command above will only run on single local node. In order to launch on a cluster, you will need to create a host file on the master node among other steps. The **TensorFlow_Multinode_Training_with_Horovod** code sample explains this process well._
+_Note: This code samples focus on using daal4py to do distributed ML computations on chunks of data. The `mpirun` command above will only run on a single local node. To launch on a cluster, you will need to create a host file on the master node, among other steps. The **TensorFlow_Multinode_Training_with_Horovod** code sample explains this process well._

##### Expected Printed Output (with similar numbers, printed 4 times):
```
@@ -1,5 +1,5 @@
# Intel Python daal4py Distributed Linear Regression
-This sample code shows how to train and predict with a distributed linear regression model using the python API package daal4py for oneAPI Data Analytics Library. It assumes you have a working version of MPI library installed and it demonstrates how to use software products that can be found in the [Intel oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html) or [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).
+This sample code shows how to train and predict with a distributed linear regression model using the python API package daal4py for oneAPI Data Analytics Library. It assumes you have a working version of the MPI library installed, and it demonstrates how to use software products that can be found in the [Intel oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html) or [Intel AI Analytics Toolkit powered by oneAPI](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

| Optimized for | Description
| :--- | :---
@@ -11,30 +11,30 @@ This sample code shows how to train and predict with a distributed linear regres

## Purpose

-daal4py is a simplified API to Intel® DAAL that allows for fast usage of the framework suited for Data Scientists or Machine Learning users. Built to help provide an abstraction to Intel® DAAL for either direct usage or integration into one's own framework.
+daal4py is a simplified API to Intel® DAAL that allows for fast usage of the framework suited for Data Scientists or Machine Learning users. Built to help provide an abstraction to Intel® DAAL for direct usage or integration into one's own framework.

-In this sample you will run a distributed Linear Regression model with oneDAL daal4py library memory objects. You will also learn how to train a model and save the information to a file.
+In this sample, you will run a distributed Linear Regression model with oneDAL daal4py library memory objects. You will also learn how to train a model and save the information to a file.
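
As a rough illustration of how a linear regression can be trained on data split across processes, the sketch below accumulates the normal-equation sums per chunk and solves once on the combined totals. It is plain Python, not daal4py or MPI; the chunks stand in for MPI ranks, and `partial_sums`, `merge`, and `solve` are hypothetical names used only for this example, which handles a single feature plus an intercept.

```python
def partial_sums(chunk):
    """Local step: sufficient statistics for one chunk of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, sxy


def merge(partials):
    """Reduce step: element-wise sum of the per-chunk statistics."""
    return tuple(sum(values) for values in zip(*partials))


def solve(stats):
    """Solve the two-parameter normal equations from the merged statistics."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Toy data following y = 2x + 1, split into four chunks.
points = [(float(x), 2.0 * x + 1.0) for x in range(8)]
chunks = [points[i:i + 2] for i in range(0, len(points), 2)]

slope, intercept = solve(merge([partial_sums(c) for c in chunks]))
print(slope, intercept)
```

Because the statistics are simple sums, the merged result is the same as fitting on the full dataset at once, regardless of how the rows are divided among chunks.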

## Key Implementation Details
-This distributed linear regression sample code is implemented for CPU using the Python language. The example assumes you have daal4py and scikit-learn installed inside a conda environment, similar to what is delivered with the installation of the Intel(R) Distribution for Python as part of the [oneAPI AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit).
+This distributed linear regression sample code is implemented for the CPU using the Python language. The example assumes you have daal4py and scikit-learn installed inside a conda environment, similar to what is delivered with the installation of the Intel(R) Distribution for Python as part of the [oneAPI AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit).


## Additional Requirements
You will need a working MPI library. We recommend to use Intel(R) MPI, which is included in the [oneAPI HPC Toolkit](https://software.intel.com/en-us/oneapi/hpc-kit).

## License
-This code sample is licensed under MIT license
+This code sample is licensed under the MIT license

## Building daal4py for CPU

-oneAPI Data Analytics Library is ready for use once you finish the Intel AI Analytics Toolkit installation, and have run the post installation script.
+oneAPI Data Analytics Library is ready for use once you finish the Intel AI Analytics Toolkit installation and have run the post installation script.

-You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation, and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.
+You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.


### Activate conda environment With Root Access

-Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script. Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a super user. If you customized the installation folder, the `setvars.sh` file is in your custom folder.
+Please follow the Getting Started Guide steps (above) to set up your oneAPI environment with the `setvars.sh` script. Then navigate in Linux shell to your oneapi installation path, typically `/opt/intel/oneapi/` when installed as root or sudo, and `~/intel/oneapi/` when not installed as a superuser. If you customized the installation folder, the `setvars.sh` file is in your custom folder.

Intel Python environment will be active by default. However, if you activated another environment, you can return with the following command:

@@ -83,9 +83,9 @@ Run the Program

`mpirun -n 4 python ./IntelPython_daal4py_Distributed_LinearRegression.py`

-The output of the script will be saved in the included models and results directories.
+The output of the script will be saved in the included models and result directories.

-_Note: This code samples focuses on how to use daal4py to do distributed ML computations on chunks of data. The `mpirun` command above will only run on single local node. In order to launch on a cluster, you will need to create a host file on the master node among other steps. The **TensorFlow_Multinode_Training_with_Horovod** code sample explains this process well._
+_Note: This code samples focus on using daal4py to do distributed ML computations on chunks of data. The `mpirun` command above will only run on a single local node. To launch on a cluster, you will need to create a host file on the master node, among other steps. The **TensorFlow_Multinode_Training_with_Horovod** code sample explains this process well._

##### Expected Printed Output (with similar numbers, printed 4 times):
```