Something is Strange with the Google Pixel 7 Pro’s Zoom Processing


The Google Pixel 7 Pro introduces a lot of new software techniques to get better images, particularly at various zoom levels. I did some detailed testing and noticed something curious about the zoom processing.

How Google’s Super Res Zoom Works

For the uninitiated, Super Res Zoom is Google’s magic to make a zoom shot better than simply cropping an image. It uses the shaking of your hand to gather more information about the thing you’re taking a picture of.

This matters because when you hold the camera perfectly still (for example, by bracing it against a window), the phone deliberately drives the optical image stabilization (OIS) motor to simulate a slight hand shake. I used this behavior throughout my testing to determine when Super Res Zoom is active.
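To make the idea concrete, here is a toy one-dimensional sketch of why small shifts between frames add information. It is not Google's algorithm: the `scene` values and the helper names `sample` and `merge_shifted_frames` are made up for illustration, and real multi-frame merging has to deal with noise, unknown sub-pixel shifts, and moving subjects.

```python
# Toy 1-D illustration of multi-frame super-resolution (not Google's method).

def sample(scene, offset, step):
    """Read every `step`-th value of `scene`, starting at index `offset`."""
    return scene[offset::step]


def merge_shifted_frames(frames, step):
    """Interleave frames captured at offsets 0..step-1 onto one fine grid."""
    merged = [None] * sum(len(frame) for frame in frames)
    for offset, frame in enumerate(frames):
        merged[offset::step] = frame
    return merged


scene = [0, 1, 4, 9, 16, 25, 36, 49]  # hypothetical fine detail in the scene
step = 2                              # the sensor only resolves every 2nd point

# A perfectly still camera sees the same coarse samples in every frame.
still_frame = sample(scene, 0, step)

# A slightly shaken camera sees a different set of samples in each frame.
shaken_frames = [sample(scene, offset, step) for offset in range(step)]

recovered = merge_shifted_frames(shaken_frames, step)
print(still_frame)
print(recovered == scene)
```

In this simplified setup the still camera never learns the in-between values, while the shifted frames together cover every point. That is the intuition behind simulating a hand shake when the phone is held steady.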

Here’s a short video that shows this. Notice how the shake starts just above 1.5x and stops at 5x:

Zoom with the Main Sensor

It appears Super Res Zoom is not active up to 1.5x zoom. I took a screen recording of the camera so I could study the viewfinder closely, and when at 1.5x zoom and below, there is no artificial motion being introduced.

Above 1.5x, it starts shaking the camera module for you. I believe this used to start at 2x zoom in previous Pixels, so they have lowered the threshold here. That means 1x to 1.5x is still just a crop. Even at 1.5x the resulting image is still 12.5 MP, so they must be filling in the missing pixels through traditional interpolation.

The camera shake is still present at 2x zoom. So even though they are cropping the middle pixels from the sensor at 2x, they are still applying Super Res Zoom on top of that crop. So, then the question might be “Would a 1.9x shot look a lot less detailed than a 2x shot?”

Well, I tested this multiple times with a completely stabilized phone and still objects, and… yes.

1.9x is quite a bit worse than 2x if you crop in on the details. From just looking at the full-size images side-by-side on a large monitor, you don’t really notice. But when you zoom in, there is definitely a difference. Take a look at these 2x and 1.9x shots:

The other thing is that the 2x shots consistently took up about 2.5 MB more space than the 1.9x shots (about 30% more space), every single time. This further supports the idea that the 2x shots have more information. So, in other words, if you are looking to zoom around 2x, just use 2x. Anything below that results in a loss of quality.
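One way to see why a 2x shot could hold more information than a 1.9x shot is to count how many real sensor pixels each crop starts from. The sketch below is back-of-the-envelope arithmetic, not a description of Google's pipeline. It assumes that shots below 2x are cropped from the pixel-binned 12.5 MP image and that the 2x shot is cropped from the full 50 MP readout; the function name `cropped_megapixels` is hypothetical.

```python
def cropped_megapixels(source_mp, zoom):
    """Megapixels of real sensor data left after a center crop at `zoom`x.

    Cropping by `zoom` keeps 1/zoom of the width and 1/zoom of the height,
    so the pixel count falls by zoom squared.
    """
    if zoom < 1:
        raise ValueError("zoom must be at least 1x")
    return source_mp / zoom ** 2


BINNED_MP = 12.5  # pixel-binned output size of the main camera
FULL_MP = 50.0    # full-resolution readout of the main sensor

# Assumption: below 2x, the crop is taken from the binned 12.5 MP image.
below_2x = {zoom: round(cropped_megapixels(BINNED_MP, zoom), 2)
            for zoom in (1.5, 1.9)}

# Assumption: at 2x, the crop is taken from the full 50 MP readout.
at_2x = cropped_megapixels(FULL_MP, 2.0)

print(below_2x)
print(at_2x)
```

Under those assumptions, a 1.9x shot starts from roughly 3.5 MP of real data before Super Res Zoom and upscaling, while a 2x shot starts from a full 12.5 MP. That would be consistent with both the visible quality gap and the larger file sizes.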

Just for kicks, I also tested 2.1x zoom, and it looks nearly identical to 2x (even though, for some odd reason, the 2.1x shot took up 3.5 MB less than the 2x shot). So essentially, anything below 2x gets nerfed, and anything below 1.5x gets extremely nerfed.

However, I decided to test that last part too, and the difference between 1.4x (no Super Res Zoom) and 1.9x (with traditional Super Res Zoom) was extremely small. Here is that comparison:

Augmented Main Camera

At zoom levels above 2x, Google claims to use the telephoto lens to augment the main lens. However, the telephoto lens can’t see everything the main lens can. So, wouldn’t that mean that the center of the image will be substantially better quality than the edges? Well, I tested this too.

The answer, unequivocally, is yes. In fact, there is a clear square in the middle of the image where the image is substantially better quality than the rest. Take a look at this “3x” photo with the yellow box I drew in the middle, which highlights where this quality difference is. You will need to zoom in, but you’ll definitely see it.

Here’s a closer crop showing the difference between what’s inside the yellow box and what is outside:

A closer crop showing the details and quality difference in the 3x zoom photo above. Left is outside the yellow box and right is inside.

However, the color profile of the telephoto is fairly different (cooler) than the main sensor, so they seem to have corrected for that in post to prevent the middle of the image from looking like a different color from the rest. I have a “5x telephoto” shot just to give you a reference of what the telephoto lens was seeing, and you can see it pretty much lines up with the square I drew, but with a different color temperature.
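The size of that sharper square can be estimated from the zoom ratio alone. The sketch below assumes the telephoto's native framing is 5x, that both lenses point at the same center, and that they share the same aspect ratio. The function name `telephoto_coverage` is made up for illustration, and real lenses will have some parallax and alignment offset.

```python
def telephoto_coverage(zoom, telephoto_zoom=5.0):
    """Return (width fraction, area fraction) of a `zoom`x frame that the
    telephoto lens can also see, assuming both lenses are centered alike."""
    if zoom <= 0 or telephoto_zoom <= 0:
        raise ValueError("zoom levels must be positive")
    # At or beyond the telephoto's own zoom, it sees the whole frame.
    width_fraction = min(zoom / telephoto_zoom, 1.0)
    return width_fraction, width_fraction ** 2


for zoom in (2.0, 3.0, 4.0, 5.0):
    width, area = telephoto_coverage(zoom)
    print(f"{zoom}x: {width:.0%} of the width, {area:.0%} of the area")
```

For the 3x photo, this estimate puts the telephoto-assisted region at about 60% of the frame width, or a little over a third of the total area, which roughly matches a box sitting well inside the frame.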

I wonder if they could do a similar thing for 1x – 2x, where they use the middle pixels for the center of the image to augment the edges being pixel-binned on the main sensor. However, this might be really difficult to pull off. I didn’t notice any square in the middle being more detailed than the edges in the main sensor images, so I doubt they are doing this.

I wonder if some super genius could come up with an algorithm where they take both pixel-binned shots and full 50 MP shots and combine them to increase both resolution and dynamic range.

Zoom with the Telephoto Lens

So, here’s the weird thing. At no point does the telephoto lens intentionally move the motor in the OIS for you when you are stabilized, regardless of zoom level. Yet, they’re almost certainly using Super Res Zoom to achieve that 30x zoom, so how are they doing it? Are they assuming that at that zoom level the user won’t be holding the camera steady regardless?

I tested at 9.8x zoom and 10x zoom and, surprisingly, there was actually no difference, unlike for the main sensor, even though Google said that it was cropping the middle pixels at 10x zoom.

In general, the lack of OIS motor movement and the lack of any quality improvement at 10x make it seem like they forgot to implement Super Res Zoom for the telephoto lens.

Take a look at these 5x crop, 12x crop, and 30x crop images:

The 12x crop and the 30x crop look nearly identical. The 5x crop only looks bad because it is such a ridiculous crop that there are barely any pixels in the image, whereas the other two appear to just be upscaled versions. Now Google says the upscaling “uses machine learning”, but why not use their own superior zoom technology? It’s like Super Res Zoom isn’t enabled for the telephoto.

Here is a comparison of a cropped 5x shot and a 10x shot:

The 10x shot does look better, even though the difference isn’t nearly as large as with the main sensor. Again, this must be due to the “machine learning upscaling”, but what isn’t adding up is why 9.8x and 10x look so similar.

I also tested whether lighting made a difference in how these lenses are engaged by comparing a 9x crop against an 11x crop taken at a different time of day.

They look fairly similar to my eyes. I mean there are some differences, but nothing like the difference between 1.9x and 2x, which is quite stark.

It may also be possible that Google is deliberately canceling out any OIS motor manipulation and hand shake in the viewfinder so that the image looks stable. Otherwise, it might look really shaky to the person holding the phone. They did say in the keynote that they are implementing strong stabilization.
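Pulling the observations together, the zoom ranges can be summarized in a small lookup. This is only a restatement of what the tests above showed or what Google has stated, not a model of the camera's internals. The function name `observed_behavior` and the dictionary keys are hypothetical labels.

```python
def observed_behavior(zoom):
    """Summarize what was observed (or stated by Google) at `zoom`x."""
    if not 1.0 <= zoom <= 30.0:
        raise ValueError("only 1x to 30x is covered by these observations")

    on_main = zoom < 5.0
    return {
        "lens": "main" if on_main else "telephoto",
        # Simulated hand shake was only seen above 1.5x and below 5x.
        "simulated_shake": 1.5 < zoom < 5.0,
        # Above 2x, the center of the frame showed telephoto-level detail.
        "telephoto_augments_center": 2.0 < zoom < 5.0,
        # Google describes a middle-pixel crop at 2x (main) and 10x (telephoto).
        # Treating it as active above those points is an assumption, and the
        # 10x case did not show up in the 9.8x vs 10x comparison.
        "middle_pixel_crop_stated": (on_main and zoom >= 2.0) or zoom >= 10.0,
    }


for zoom in (1.4, 1.9, 2.0, 3.0, 9.8, 10.0):
    print(zoom, observed_behavior(zoom))
```

Laid out this way, the odd part stands out: the telephoto range never reports a simulated shake, even though the main sensor does from just above 1.5x.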


About the author: Aashish Bharadwaj is a software developer based in the United States who is also a passionate photography enthusiast.


