R1-2409482
discussion
Discussion on study for AI/ML CSI compression
From ZTE
ZTE's prior position on 9.1.4.1 at RAN1#118bis (AI-synthesized, paraphrased)
Advocates for prioritizing Option 3 (standardized reference model structure with parameter exchange) using NW-first training and over-the-air delivery, while opposing Option 4 due to dataset exchange overhead concerns.
Summary
ZTE analyzes inter-vendor training collaboration options for AI/ML-based CSI compression in NR Release 19, focusing on Direction A (UE-side offline engineering), Direction B (on-device operation), and Direction C (fully standardized reference model). The document presents 32 proposals and 10 observations. It argues for comparing Case 2 and Case 3 with a view to down-selection and for deferring specification impact analysis until feasibility studies conclude, and it highlights data distribution mismatch issues, which it proposes to address through mixed-dataset training or associated IDs.
Position
General: ZTE proposes comparing Case 2 and Case 3 for potential down-selection, to reduce the effort spent on specification impact analysis, and deferring specification impact analysis for inter-vendor training collaboration until feasibility studies conclude.
Direction A: ZTE proposes that the NW share performance targets with the UE side, and model backbone information as well, provided that NW-side proprietary information can be maintained. Data distribution mismatch would be resolved via associated ID indications or timely NW-side data collection.
Direction B: ZTE argues that training multiple UE-specific encoders is infeasible because of proprietary information disclosure risks, and favors universal encoders despite potential performance sacrifices. It proposes using associated IDs and continuous monitoring to address data distribution mismatch.
Direction C: ZTE supports using synthetic data from 3GPP statistical channel models as a starting point, with model retraining on real-world data to bridge the distribution gap.
Remaining issues: ZTE proposes studying enhanced Rel-16 eTypeII codebook designs for high-resolution CSI, prioritizing NW-side monitoring based on target CSI with realistic channel estimation, and deprioritizing UE-side monitoring in Rel-19.
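The associated-ID idea raised for Directions A and B can be pictured as a simple data-categorization step: samples collected under the same (implicitly abstracted) NW-side conditions share an ID, and training or inference uses only the matching bucket. The sketch below is illustrative only. The record layout, the function names `categorize_by_associated_id` and `select_training_set`, and the ID values are assumptions made for this example, not formats defined by 3GPP or by the contribution.

```python
from collections import defaultdict

# Hypothetical records: (associated_id, channel_sample).
# IDs and payloads are placeholders, not 3GPP-defined formats.
def categorize_by_associated_id(samples):
    """Group collected samples so each associated ID gets its own dataset."""
    buckets = defaultdict(list)
    for associated_id, payload in samples:
        buckets[associated_id].append(payload)
    return dict(buckets)

def select_training_set(buckets, indicated_id):
    """Return only the data collected under the ID the NW currently indicates."""
    return buckets.get(indicated_id, [])

samples = [(1, "h_a"), (2, "h_b"), (1, "h_c")]
buckets = categorize_by_associated_id(samples)
training_set = select_training_set(buckets, indicated_id=1)
```

An ID with no collected data yields an empty set, which in a real deployment would be one possible trigger for further data collection or monitoring.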
Key proposals
- Proposal 1 (General views): To reduce specification impact analysis efforts, conduct a comparison between Case 2 and Case 3 for potential down selection.
- Proposal 2 (General views): Conduct further evaluations and comparisons of different inter-vendor training collaboration options for feasibility study and potential down selection.
- Proposal 3 (General views): Defer specification impact analysis related to inter-vendor training collaboration until feasibility study and comparison of different options are concluded.
- Proposal 4 (Direction A, Issue 1): For sub-option 3a-1, at least the performance target should be shared from NW-side to UE-side to enable UE-side encoder training, validation, and testing.
- Proposal 5 (Direction A, Issue 1): For sub-option 4-1, the performance target and model backbone information should be shared from NW-side to UE-side to enable UE-side encoder training, validation, and testing.
- Proposal 6 (Direction A, Issue 2): For sub-option 4-1, model backbone or structure information can only be shared if the proprietary information of the NW side can be maintained.
- Proposal 7 (Direction A, Issue 4): Data distribution mismatch with respect to NW-side additional conditions can be resolved by indicating an associated ID that implicitly abstracts the NW-side additional conditions.
- Proposal 8 (Direction B, Issue 3): Option 3b is attractive from an overhead perspective, as the encoder is much smaller than the training dataset and parameter transfer is not expected to occur frequently within a cell.
- Proposal 9 (Direction B, Issue 5): It may be infeasible for the NW side to train multiple encoders tailored to different UEs due to potential UE proprietary information disclosure risks and offline collaboration efforts.
- Proposal 10 (Direction B, Issue 6): Data categorization using associated IDs and continuous monitoring can be considered for addressing data distribution mismatch issues in Direction B.
- Proposal 11 (Direction C, Issue 8): Synthetic data generated under 3GPP's statistical channel model can be a starting point for reference model training.
- Proposal 12 (Direction C, Issue 9): Model retraining based on data collected in the real world can be performed to bridge the gap between synthetic data and field data.
- Proposal 13 (Data collection): Support further study of Enhanced Rel-16 eTypeII codebook design to achieve high-resolution CSI for model training and performance monitoring.
- Proposal 14 (Data collection): To enable high-quality data collection, support the UE reporting data-quality-related information (e.g., SINR, CQI) to the NW, and the NW configuring a data quality threshold for the UE.
- Proposal 15 (Performance monitoring): Deprioritize the study of UE-side monitoring in the Rel-19 study phase for the CSI compression use case with a two-sided model.
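The quality-threshold mechanism in Proposal 14 can be sketched as a filter over collected samples. This is a minimal illustration under stated assumptions: the `CsiSample` record, its `sinr_db` field, the function `filter_by_quality`, and the 10 dB threshold are hypothetical and chosen only for the example. The contribution names SINR and CQI as possible quality indicators but does not define a data format or threshold values.

```python
from dataclasses import dataclass

@dataclass
class CsiSample:
    sample_id: int
    sinr_db: float  # quality indicator reported with the sample (SINR as one example)

def filter_by_quality(samples, sinr_threshold_db):
    """Keep samples whose reported SINR meets the configured threshold."""
    return [s for s in samples if s.sinr_db >= sinr_threshold_db]

# Hypothetical collected data and threshold, for illustration only.
collected = [CsiSample(0, 3.5), CsiSample(1, 12.0), CsiSample(2, 20.1)]
kept = filter_by_quality(collected, sinr_threshold_db=10.0)
kept_ids = [s.sample_id for s in kept]
print(kept_ids)  # prints [1, 2]
```

Whether the filtering happens at the UE before reporting or at the NW after reporting is a design choice this sketch leaves open.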