Special Issue: Novel Approaches for Distributed Intelligent Systems

Block size, parallelism and predictive performance: finding the sweet spot in distributed learning

Pages 379-398 | Received 10 Feb 2023, Accepted 12 Jun 2023, Published online: 27 Jun 2023

