You’re very welcome, António, thanks for the kind feedback.
The number of clones, or points, defines the resolution. Typically, if the sound is pleasant to the ear, the resolution is high, which means there is plenty to work with.
If you can produce images, the amount of detail then depends on how finely the surface can be moved locally (the points) to get this effect. Besides this, there are techniques that only simulate a detailed surface, namely Bump Mapping and Normal Mapping. These two certainly allow very fine detailing without going to extremes with polygons/points. However, they have their limitations: they do not produce any real change to the surface, it just looks that way.
Image files are defined by several parameters, typically the resolution in pixels and the color depth, i.e. the bit depth per channel. With PNG you would not be well off with 8 bits per channel; 16 bits per channel should be fine. However, with those you would need to know how any color profile might distort your content, which is why I typically use OpenEXR; I’m not really into gamma-based images for that. Music has no gamma, and applying a gamma curve to it would only mean that the values are shifted to be stored in gamma space and then shifted back to get them out again. I often see sRGB in use, which is at best a delivery format and should never be used inside a production if high quality is the target. Without doubt, it shouldn’t be used for data-storing images like the one you have in mind, or for others like alpha, normal, depth, etc. Hence, again, linear formats. This takes a lot of limitations out of the equation and puts pure quality into it, at 16 bits per channel or 32-bit float per channel.
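To make the bit-depth and gamma point a bit more tangible, here is a minimal Python sketch. It is only an illustration, not part of any actual setup: the function names are made up for this example, it uses the standard sRGB transfer curve as a stand-in for "a gamma curve", and it treats a sample as a plain value between 0 and 1. It measures how far a value ends up from the original after being stored at a given integer bit depth, with or without a trip through gamma space.

```python
def srgb_encode(x):
    """Linear value in [0, 1] -> sRGB-encoded value in [0, 1]."""
    if x <= 0.0031308:
        return 12.92 * x
    return 1.055 * x ** (1 / 2.4) - 0.055


def srgb_decode(v):
    """sRGB-encoded value in [0, 1] -> linear value in [0, 1]."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def quantize(x, bits):
    """Round x to the nearest level an integer channel of this depth can hold."""
    levels = (1 << bits) - 1
    return round(x * levels) / levels


def max_roundtrip_error(bits, use_gamma, steps=1000):
    """Worst absolute difference between original and restored values.

    Hypothetical helper: sweeps evenly spaced values from 0 to 1,
    stores each one (optionally gamma-encoded) and reads it back.
    """
    worst = 0.0
    for i in range(steps + 1):
        x = i / steps
        stored = quantize(srgb_encode(x) if use_gamma else x, bits)
        restored = srgb_decode(stored) if use_gamma else stored
        worst = max(worst, abs(restored - x))
    return worst


if __name__ == "__main__":
    for bits in (8, 16):
        for use_gamma in (False, True):
            label = "gamma" if use_gamma else "linear"
            print(bits, label, max_roundtrip_error(bits, use_gamma))
```

The gamma round trip adds no information; it only changes where the rounding error lands, while going from 8 to 16 bits shrinks that error everywhere. A float format such as OpenEXR avoids the integer rounding step modeled here altogether.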
You’re right: compression has no idea that data is stored instead of a picture. That would be a big problem, except perhaps for anything lossless, like a run-length algorithm.
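As a small sketch of why a lossless scheme is safe for data, here is a toy run-length encoder in Python. The function names and the sample values are invented for this example, and real image formats use more elaborate schemes; the only point is that decoding gives back exactly the values that went in.

```python
def rle_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1      # same value as the previous one: extend the run
        else:
            runs.append([v, 1])   # new value: start a new run
    return runs


def rle_decode(runs):
    """Expand [value, count] pairs back into the original sequence."""
    out = []
    for v, count in runs:
        out.extend([v] * count)
    return out


if __name__ == "__main__":
    # Made-up 16-bit sample values standing in for one row of pixels.
    row = [0, 0, 0, 512, 512, 65535]
    packed = rle_encode(row)
    print(packed)
    print(rle_decode(packed) == row)
```

A lossy codec, by contrast, is free to change values as long as the picture still looks similar, which is exactly what must not happen to stored data.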
My best wishes for your project.