1)kaggle-https://www.kaggle.com/datasets , pip install kaggledatasets
Downloading Kaggle datasets directly into Google Colab -https://towardsdatascience.com/downloading-kaggle-datasets-directly-into-google-colab-c8f0f407d73a
How to Download Kaggle Datasets using Jupyter Notebook https://www.analyticsvidhya.com/blog/2021/04/how-to-download-kaggle-datasets-using-jupyter-notebook/
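The download step described in the articles above can be sketched roughly as follows. This is a minimal sketch that assumes the official `kaggle` command-line client (`pip install kaggle`, which is a different package from the one named above) and an API token saved in `~/.kaggle/kaggle.json`. The function names and the `owner/dataset-name` slug are hypothetical placeholders.

```python
import subprocess


def kaggle_download_command(dataset, dest="data", unzip=True):
    """Build the argument list for `kaggle datasets download`."""
    cmd = ["kaggle", "datasets", "download", "-d", dataset, "-p", dest]
    if unzip:
        # Ask the CLI to extract the archive after downloading it.
        cmd.append("--unzip")
    return cmd


def download_kaggle_dataset(dataset, dest="data"):
    """Run the download; assumes the kaggle CLI is installed and authenticated."""
    subprocess.run(kaggle_download_command(dataset, dest), check=True)


if __name__ == "__main__":
    # "owner/dataset-name" is a placeholder, not a real dataset slug.
    print(" ".join(kaggle_download_command("owner/dataset-name")))
```

In Colab or Jupyter the same command can be run in a cell prefixed with `!`; building it as a list here only avoids shell quoting issues.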
2)https://sebastianraschka.com/blog/2021/ml-dl-datasets.html
movielens-https://grouplens.org/datasets/movielens/latest/
3)data.gov-https://data.gov.in/
4)uci-https://archive.ics.uci.edu/ml/datasets.php https://github.com/tirthajyoti/UCI-ML-API
5)Group Lens dataset https://grouplens.org/
Wikipedia ML Datasets https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
6)data.world https://data.world/ , World Bank
7)Google Cloud BigQuery public datasets
Google Public Datasets-cloud.google.com/bigquery/public-data/
Google Cloud Data Catalog https://cloud.google.com/data-catalog
Academic Torrents-https://academictorrents.com/check.htm?returnto=%2Fbrowse.php
8)online hackathons
Datasets https://www.paperswithcode.com/datasets
9)image data from google_images_download
https://www.visualdata.io/discovery
https://xviewdataset.org/#dataset
https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html
10)image data from Bing_Search
image data from simple_image_download https://github.com/RiddlerQ/simple_image_download
11)https://www.columnfivemedia.com/100-best-free-data-sources-infographic
12)Reddit:https://lnkd.in/dv5UCD4 https://www.reddit.com/r/datasets/
13)https://datasets.bifrost.ai/?ref=producthunt
14)data.world:https://lnkd.in/gEK897K
15)https://data.world/datasets/open-data
https://tinyletter.com/data-is-plural
16)FiveThirtyEight :- https://lnkd.in/gyh-HDj , https://data.fivethirtyeight.com/
17)BuzzFeed :- https://lnkd.in/gzPWyHj
Buzzfeed News -github.com/BuzzFeedNews
Socrata - https://opendata.socrata.com/
18)Google public datasets :- https://lnkd.in/g5dH8qE
Statistics Canada https://www.statcan.gc.ca/eng/start https://towardsdatascience.com/how-to-collect-data-from-statistics-canada-using-python-db8a81ce6475
Deep Image Search AI-based image search engine https://github.com/TechyNilesh/DeepImageSearch
https://www.datasciencecentral.com/profiles/blogs/big-data-sets-available-for-free
19)Quandl :- https://www.quandl.com stock data
statista : https://www.statista.com/ stock data
20)Socrata Open Data :- https://lnkd.in/gea7JMz
21)Academic Torrents :- https://lnkd.in/g-Ur9Xy
22) Automated image annotation for deep learning models https://medium.com/towards-artificial-intelligence/improving-data-labeling-efficiency-with-auto-labeling-uncertainty-estimates-and-active-learning-5848272365be
https://neptune.ai/blog/annotation-tool-comparison-deep-learning-data-annotation?utm_source=linkedin&utm_medium=post&utm_campaign=blog-annotation-tool-comparison-deep-learning-data-annotation
Diffgram,Label Studio ,CVAT,SuperAnnotate,Datasaur https://anthony-sarkis.medium.com/the-5-best-ai-data-annotation-platforms-for-machine-learning-2021-ec17c15142f3
https://foobar167.medium.com/open-source-free-software-for-image-segmentation-and-labeling-4b0332049878
***Label Assist: Model Assisted Pre-Annotation for Computer Vision https://blog.roboflow.com/announcing-label-assist/ https://www.youtube.com/watch?v=919CihTlkZw&feature=youtu.be***
https://github.com/jsbroks/awesome-dataset-tools
jupyter-innotater data annotator for Jupyter notebooks https://github.com/ideonate/jupyter-innotater
semi-auto-image-annotation-tool https://github.com/virajmavani/semi-auto-image-annotation-tool
labelme, labelImg:- https://github.com/wkentaro/labelme , https://github.com/tzutalin/labelImg
labelCloud lightweight tool for labeling 3D bounding boxes in point clouds https://github.com/ch-sa/labelCloud
Labellerr https://www.labellerr.com/
Prodigy: an annotation tool powered by active learning ("radically efficient machine teaching") https://prodi.gy/
Labelbox-https://labelbox.com/
Playment-https://playment.io/
SuperAnnotate -https://www.superannotate.com/
CVAT-https://github.com/openvinotoolkit/cvat
Lionbridge- https://lionbridge.ai/
LinkedAI: a no-code data annotation platform - https://analyticsindiamag.com/linkedai/
Dataturks
V7 Darwin The Rapid Image Annotator https://docs.v7labs.com/docs/loading-a-dataset-in-python https://github.com/v7labs/darwin-py#usage-as-a-python-library
https://waliamrinal.medium.com/top-and-easy-to-use-open-source-image-labelling-tools-for-machine-learning-projects-ffd9d5af4a20
https://github.com/heartexlabs/awesome-data-labeling
Label a Dataset with a Few Lines of Code https://eric-landau.medium.com/label-a-dataset-with-a-few-lines-of-code-45c140ff119d
https://analyticsindiamag.com/complete-guide-to-data-labelling-tools/ https://neptune.ai/blog/data-labeling-software
Extraction of Objects In Images and Videos Using 5 Lines of Code https://towardsdatascience.com/extraction-of-objects-in-images-and-videos-using-5-lines-of-code-6a9e35677a31
23)TensorFlow Datasets https://www.tensorflow.org/datasets (import tensorflow_datasets as tfds)
https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
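As a rough illustration of the `tensorflow_datasets` entry above, loading a dataset might look like the sketch below. It assumes `tensorflow` and `tensorflow-datasets` are installed and that the machine can download the data on first use. The helper names and the choice of `mnist` are illustrative, not part of the original list.

```python
def percent_slice(split, percent):
    """Return a TFDS split string such as 'train[:10%]'."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return f"{split}[:{percent}%]"


def load_tfds(name="mnist", split="train"):
    """Load a TFDS dataset as (tf.data.Dataset, DatasetInfo).

    The import is deferred so that this module can be imported
    even when tensorflow_datasets is not installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load(name, split=split, as_supervised=True, with_info=True)


if __name__ == "__main__":
    # Only prints the split string; call load_tfds(...) to actually download.
    print(percent_slice("train", 10))
```

Calling `load_tfds("mnist", percent_slice("train", 10))` would request only the first tenth of the training split, which is handy for quick experiments.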
24)https://datasets.bifrost.ai/?ref=producthunt
25)https://ourworldindata.org/
26)https://data.worldbank.org/
27)google open images:https://storage.googleapis.com/openimages/web/download.html
30 Largest TensorFlow Datasets for Machine Learning https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
https://cloud.google.com/bigquery/public-data/ https://towardsdatascience.com/bigquery-public-datasets-936e1c50e6bc
https://christopherzita.medium.com/how-to-download-google-images-using-python-2021-82e69c637d59
28)https://data.gov.in/
29)imagenet dataset-https://www.image-net.org/
30)https://parulpandey.com/2020/08/09/getting-datasets-for-data-analysis-tasks%e2%80%8a-%e2%80%8aadvanced-google-search/
31)https://storage.googleapis.com/openimages/web/index.html ,
https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&r=false&c=%2Fm%2F09qck
https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset&_ga=2.35328417.1459465882.1589693499-869920574.1589693499
https://catalog.data.gov/dataset?groups=education2168#topic=education_navigation
https://vincentarelbundock.github.io/Rdatasets/datasets.html
32)coco dataset https://cocodataset.org/#explore
33)huggingface datasets-https://github.com/huggingface/datasets https://huggingface.co/datasets https://huggingface.co/languages
pip install datasets
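Once `datasets` is installed, a first look at a Hugging Face dataset could be sketched like this. It assumes network access to the Hub; the helper names are hypothetical, and `imdb` in the usage note is only an example dataset name. Streaming mode is used so that nothing has to be fully downloaded before the first records arrive.

```python
from itertools import islice


def first_n(iterable, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(iterable, n))


def stream_hf_dataset(name, split="train"):
    """Open a Hub dataset in streaming mode (returns an iterable dataset).

    The import is deferred so the module imports cleanly
    even when the `datasets` package is missing.
    """
    from datasets import load_dataset

    return load_dataset(name, split=split, streaming=True)


if __name__ == "__main__":
    # Offline demonstration of the helper on a plain range.
    print(first_n(range(10), 3))
```

For example, `first_n(stream_hf_dataset("imdb"), 3)` would fetch three records from the training split without downloading the whole dataset.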
34)Big Bad NLP Database-https://datasets.quantumstat.com/
fast.ai Datasets https://course.fast.ai/datasets
https://github.com/niderhoff/nlp-datasets
600 NLP Datasets and Glory https://pub.towardsai.net/600-nlp-datasets-and-glory-4b0080bf5ab
nlp-datasets https://github.com/karthikncode/nlp-datasets
https://analyticsindiamag.com/15-most-important-nlp-datasets/ https://medium.com/ai-in-plain-english/25-free-datasets-for-natural-language-processing-57e407402c60
35)https://www.edureka.co/blog/25-best-free-datasets-machine-learning/
36)bigquery public dataset ,Google Public Data Explorer
https://cloud.google.com/public-datasets https://guides.library.cmu.edu/machine-learning/datasets
37)built-in library datasets, e.g. the iris dataset, the MNIST dataset, etc.
pandas-datareader https://github.com/pydata/pandas-datareader
tf.data.Dataset (the dataset type returned by TensorFlow Datasets)
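For the built-in datasets mentioned above, a minimal sketch using scikit-learn's bundled copy of iris is shown here. It assumes `scikit-learn` and `pandas` are installed; the function name is hypothetical. No download is needed because the data ships with the library.

```python
def load_iris_frame():
    """Return the iris dataset as a pandas DataFrame (features plus 'target').

    The import is deferred so that this module can be imported
    even when scikit-learn is not installed.
    """
    from sklearn.datasets import load_iris

    return load_iris(as_frame=True).frame


if __name__ == "__main__":
    try:
        print(load_iris_frame().head())
    except ImportError:
        print("scikit-learn and pandas are required for this example")
```

The returned frame has one row per flower, with the four measurements and the class label in a `target` column.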
38)https://data.gov.sg/ https://data.gov.au/ https://data.europa.eu/euodp/en/data https://data.govt.nz/
data.gov.be ,data.egov.bg/ ,data.gov.cz/english ,portal.opendata.dk,govdata.de,opendata.riik.ee,data.gov.ie,data.gov.gr,datos.gob.es,data.gouv.fr,data.gov.hr
dati.gov.it,data.gov.cy,opendata.gov.lt,data.gov.lv,data.public.lu,data.gov.mt,data.overheid.nl,data.gv.at,danepubliczne.gov.pl,dados.gov.pt,data.gov.ro,podatki.gov.si
data.gov.sk,avoindata.fi,oppnadata.se,https://data.adb.org/ ,https://data.iadb.org/ ,https://www.weforum.org/agenda/2018/03/latin-america-smart-cities-big-data/
https://data.fivethirtyeight.com/ , https://wiki.dbpedia.org/ ,https://www.europeandataportal.eu/en ,https://data.europa.eu/ ,https://www.census.gov/,
https://www.who.int/data/gho ,https://data.unicef.org/open-data/ ,https://data.un.org/ ,https://data.oecd.org/ ,https://data.worldbank.org/
39)Awesome Public Datasets- https://github.com/awesomedata/awesome-public-datasets
Get OpenML’s Dataset in One Line of Code https://mathdatasimplified.com/2021/04/23/fetch_openml-get-openmls-dataset-in-one-line-of-code/
https://github.com/the-pudding/data
datasets https://github.com/benedekrozemberczki/datasets
kdnuggets https://www.kdnuggets.com/datasets/index.html
Hub https://github.com/activeloopai/Hub
40)Datasets for Machine Learning on Graphs-https://ogb.stanford.edu/
41)https://www.johnsnowlabs.com/data/
42)30 largest TensorFlow datasets-https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
43)COCO dataset-https://cocodataset.org/#home
Check out my repository for more details.