Good afternoon.

1. There is a set of original images (say, 1000x500 px).

2. There is a set of reduced copies (500x250) of the images from item 1.

For each original, I need to find its reduced copy.

Please tell me which direction to dig in:

1. Should I reduce each image from item 1 to 500x250 and compare it with the set from item 2?

2. How is the comparison itself done? Do I need to convert the images to binary data and compare that?

3. To repeat, the images **are the same**: one folder simply holds the originals and the other holds reduced versions of them.

Thank you.


asked March 23rd 20 at 19:30

2 answers

answered on March 23rd 20 at 19:32

Reducing the originals is not guaranteed to help: even a one-percent difference in compression changes the file's hash completely, because of the avalanche effect.
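The avalanche effect mentioned above can be illustrated with a minimal sketch. The byte strings below are hypothetical stand-ins for two encodings of the same picture that differ in a single bit; SHA-256 is an assumption, since the answer does not name a hash function.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical "files": identical except for one flipped bit.
file_a = bytes(range(256)) * 4
changed = bytearray(file_a)
changed[100] ^= 0x01
file_b = bytes(changed)

digest_a = hashlib.sha256(file_a).digest()
digest_b = hashlib.sha256(file_b).digest()

print(digest_a == digest_b)                # False
print(differing_bits(digest_a, digest_b))  # usually close to half of the 256 bits
```

So comparing file hashes can only confirm byte-for-byte equality; it says nothing about how visually close two files are.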

Look for an AI-based system for finding similar images; such tools certainly exist.


answered on March 23rd 20 at 19:34

As a mathematician, I have to say that a solution to this problem exists. (That is a joke mathematicians will understand.)

Only the word "identical" has to be dropped from the problem statement. It makes no sense here and makes the task unsolvable. When an image was reduced, some interpolation algorithm ran first; there are a dozen such algorithms, maybe more, and each produces a slightly different picture. Then a compression algorithm ran with unknown parameters. In other words, by varying the parameters you can make thousands (even millions) of different thumbnails from one original. They are not exactly identical, although many are indistinguishable to the eye. You would need a great deal of luck (not to mention heaps of effort and time) to guess all the parameters. So forget about "identical"; we will use strict mathematical methods **to search for the most similar** pictures. If the similarity does not exceed some threshold (say, 97.1% or 99.5%), we say that no similar image was found at all.

Images have to be compared at the same resolution, so either the large one is reduced or the small one is enlarged. It is not obvious that comparing the small versions is more effective (we lose some information in exchange for a faster comparison and a simpler algorithm); that deserves a separate study. For simplicity, we will reduce the originals and compare only thumbnail-sized images.
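Reducing an original to thumbnail size might look like the sketch below. It assumes grayscale images stored as plain lists of rows and a simple 2x2 box average; the interpolation that actually produced the thumbnails is unknown, so this is only one of many possible choices. The function name `downscale_by_two` is hypothetical.

```python
def downscale_by_two(image):
    """Halve width and height by averaging each 2x2 block of grayscale pixels.

    `image` is a list of rows, each a list of ints in 0..255.
    Both dimensions are assumed to be even.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


original = [
    [10, 20, 200, 220],
    [30, 40, 210, 230],
]
print(downscale_by_two(original))  # [[25, 215]]
```

A 1000x500 original passed through this function becomes 500x250, the same size as the thumbnails, so the two can be compared pixel by pixel.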

Now the comparison itself (in a loop over all images, of course). Subtract one image from the other pixel by pixel. In the case of "full identity", the difference would be a completely black image. We will not get that. Instead we have to look at how "dirty" the difference image is (strictly speaking, the absolute difference; I say "difference" only to keep the explanation simple). The fewer bright pixels in the difference, the more similar the images. In mathematics, the relevant field is "**optimization methods**" (google it, get a textbook from the library, etc.). You still have to choose the criterion for "blackness", but that is a technical detail. The task is not trivial, but it is solvable.
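The comparison loop described above can be sketched as follows. The mean absolute difference is used here as the "blackness" criterion, which is an assumption, as are the names `mean_abs_diff` and `best_match` and the `max_diff` threshold of 5.0. Images are grayscale lists of rows of equal size.

```python
def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference of two same-sized images (0 = identical)."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def best_match(reduced_original, thumbnails, max_diff=5.0):
    """Return (name, score) of the closest thumbnail, or None if none is close enough.

    `thumbnails` maps a file name to an image; `max_diff` is an illustrative threshold.
    """
    name, score = min(
        ((n, mean_abs_diff(reduced_original, img)) for n, img in thumbnails.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score <= max_diff else None


reduced = [[25, 215], [100, 50]]
thumbs = {
    "a.jpg": [[26, 214], [101, 49]],  # same picture with slight compression noise
    "b.jpg": [[200, 10], [0, 255]],   # a different picture
}
print(best_match(reduced, thumbs))  # ('a.jpg', 1.0)
```

The threshold value is purely illustrative; for sets of very similar pictures it would need to be tuned on real data.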



2. The thing is, the images being compared are scans of documents, so they will all look very similar; only the numbers inside the documents may differ. - clemens_OHara commented on March 23rd 20 at 19:35