Multi-criteria decision-making and data-driven multivariate models were used to assess the suitability of agricultural land for growing legume crops, relying primarily on remotely sensed data. Machine learning models require well-balanced datasets and may exhibit greater uncertainty when predicting suitability in areas where the crops are not typically grown or are underrepresented in the data. Over 70% of the area classified as suitable lies within 2 km of the river or its tributaries, owing to higher soil moisture and potentially higher nutrient levels in the soil types found near rivers.
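
The share of the suitable class near the river network can be illustrated with a minimal sketch. It assumes two co-registered rasters with equal-area pixels: a boolean suitability mask and a precomputed distance-to-river raster in metres. The function name `share_within_buffer`, its parameters, and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np


def share_within_buffer(suitable, river_distance_m, radius_m=2000.0):
    """Return the fraction of suitable pixels within radius_m of the river.

    Assumes both rasters are co-registered and that every pixel covers
    the same ground area, so pixel counts are proportional to area.
    """
    suitable = np.asarray(suitable, dtype=bool)
    river_distance_m = np.asarray(river_distance_m, dtype=float)
    if suitable.shape != river_distance_m.shape:
        raise ValueError("rasters must share the same shape")

    total = suitable.sum()
    if total == 0:
        # No suitable pixels at all: the share is undefined.
        return float("nan")

    # Suitable pixels whose distance to the river is within the buffer.
    near = suitable & (river_distance_m <= radius_m)
    return float(near.sum() / total)


# Toy 3 x 3 rasters (illustrative values only).
suitable = np.array([[1, 1, 0],
                     [1, 0, 0],
                     [0, 1, 0]], dtype=bool)
distance = np.array([[500, 1500, 2500],
                     [1000, 2000, 3000],
                     [4000, 5000, 6000]])

share = share_within_buffer(suitable, distance)
print(f"Share of suitable pixels within 2 km: {share:.0%}")
```

In this toy grid, three of the four suitable pixels fall inside the 2 km buffer. With real data, the distance raster would need to be derived from the river and tributary geometry in a projected coordinate system so that distances are expressed in metres.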