Optical neural networks (ONNs) are attracting growing attention as a means of accelerating machine learning tasks. In particular, static meta-optical encoders designed for task-specific preprocessing have demonstrated orders-of-magnitude lower energy consumption than purely digital counterparts, albeit at the cost of a slight degradation in classification accuracy. However, their lack of generalizability poses a serious challenge to the wide deployment of static meta-optical front-ends. Here, we investigate the utility of a single-layer metalens as a meta-optical encoder in ONNs for generalizable image classification. Specifically, we show that a visible-spectrum broadband metalens achieves image classification accuracy comparable to that of high-end, sensor-limited optics and consistently outperforms the corresponding hyperboloid baseline across a wide range of sensor pixel sizes and digital backends. We further design an end-to-end optimized single-aperture metasurface for ImageNet classification and observe that the optimization tends to balance the modulation transfer function (MTF) across wavelengths within the sensor-detectable passband. Together, these observations suggest that preserving spatial-frequency information is an important factor in ONN performance. Our results provide physical insight into task-driven optical optimization and offer practical guidance for designing high-performance ONNs and meta-optical encoders for generalizable computer-vision tasks.