This dataset comprises 152 proteins. It is intended for unsupervised and supervised studies such as: variability analysis of the biomacromolecular descriptors based on Shannon entropy; principal component analysis; and correlation analysis of the algebraic form-based descriptors with three structural parameters: radius of gyration (RG), occluded surface packing (OSP), and folding degree (I3).
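As a rough illustration of the three analyses named above, the following Python sketch computes a Shannon entropy per descriptor, a principal component decomposition, and a Pearson correlation against a structural parameter. It is only a sketch under stated assumptions: the equal-width binning, the number of bins, the function names, and the synthetic descriptor matrix are all hypothetical and are not taken from the dataset or from any accompanying software.

```python
import numpy as np


def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of one descriptor after equal-width binning.

    The binning scheme and n_bins are assumptions made for this sketch.
    """
    values = np.asarray(values, dtype=float)
    counts, _ = np.histogram(values, bins=n_bins)
    probs = counts[counts > 0] / counts.sum()  # drop empty bins, normalise
    return float(-np.sum(probs * np.log2(probs)))


def pca_explained_variance_ratio(matrix):
    """Fraction of total variance carried by each principal component.

    Columns are mean-centred only (no scaling), then decomposed by SVD.
    """
    matrix = np.asarray(matrix, dtype=float)
    centred = matrix - matrix.mean(axis=0)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def pearson_correlation(descriptor, parameter):
    """Pearson correlation between a descriptor and a structural parameter."""
    return float(np.corrcoef(descriptor, parameter)[0, 1])


# Synthetic stand-in data: 152 rows (one per protein), 5 hypothetical descriptors.
# These numbers are random placeholders, NOT values from the dataset.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(152, 5))
radius_of_gyration = rng.normal(size=152)  # placeholder for the RG column

entropies = [shannon_entropy(descriptors[:, j]) for j in range(descriptors.shape[1])]
variance_ratios = pca_explained_variance_ratio(descriptors)
rg_correlations = [
    pearson_correlation(descriptors[:, j], radius_of_gyration)
    for j in range(descriptors.shape[1])
]
```

With the real data, the synthetic arrays would be replaced by the descriptor columns and by the RG, OSP, or I3 column; a descriptor with higher entropy varies more across the 152 proteins under the chosen binning.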