
Rating changes

HotnessRater

Administrator
Staff member
I’ve been watching rating functioning for over a month, now, and I have some observations and suggestions.

First, the picture size is too small. Sometimes it is a large pic shrunk down, but sometimes it is a small pic, and there is no way to tell. When you look at the babe after you make your rating, you find this small pic that loses resolution when it is expanded to fit the screen. There have been any number of pics I would have rated differently if I had known. After all, that is what we are rating, the picture.
That is by design. We don't want you voting on the picture size. We want you voting on the contents of the picture. Picture size is subject to change with new uploads and the new picture keeps the vote history.

Otherwise every pic of a babe would be rated the same.
I completely disagree. Everyone has good and bad pictures, especially as people age and have a lifetime's work represented.

Second, while the battle of the day is many votes between two babes, the individual ratings are not, and lead to some screwy results. After rating a pair of pics, I look at each to see the wins and losses, and enough of them are affected either by favoritism or possibly self-interest to skew the results. Is it possible to be given your own babes to rate, or is that source of bias weeded out?
We balance our queue very carefully. It would be extremely hard to skew the results. We limit how many times you are given both a picture and a person and we also give consideration to who uploaded the picture.

Anyway, I don’t think any matchup should be decided on single votes. Once a matchup gets created, it should be handed out to others periodically until at least 20 votes have been received. This will keep great pics from being downgraded and junk photos getting good ratings because of flaky voters.
No match up is decided by a single vote. The whole system runs on the number of wins and losses vs people in their range. A single vote has little or often no effect on the rating.

How does the battle of the day affect picture ratings? Is it a pic vs pic thing, or a babe vs babe thing? If the latter, it would be helpful if several pics for each babe were chosen.
Battles of the day don't affect rating at all. Those votes are not part of the regular voting system. We run the rating system on each picture being rated against a wide range of pictures, the more the better. Including battle of the day votes would destroy that.

How are the top 100 pics chosen? I see a lot of pics in there that are simply not good enough. Yesterday I saw one of just an ass (which bemuses me because I own it). No pic should make the top 100 if you can’t identify the babe. Also, it should require a minimum number of votes, like 500 or more. There are lots of iconic photos on this site that are far far better than a lot of what is in the top 100. Some of them have had their ratings distorted by the voting problem I mention above.
Well we have 2 lists, the top pictures and the top women. If we limited it to 500 votes, only 0.11% of our pictures would be eligible (and this would be your #1 picture: https://hotnessrater.com/picture/5467187/emmanuelle-chriqui) Personally I think our #1 picture right now is better

We do limit it to having had 15 votes against different high rated pictures. We have played with this number on and off. It seems that any bigger than that, we have pictures staying on top for weeks or months which is boring for our users. Less than that and you get a higher frequency of pictures that hit number one that shouldn't be there.

You mentioned an ass pic that made it... When you are the #1 picture, you show up more frequently in voting. You will notice that that picture was not there long.

There should be other requirements: the photos should have a minimum number of ratings against other top 100 photos, to make sure they are ranked correctly.
We do that. Votes are only counted if their competitor's picture falls within their same range.

Also, they should be ranked against other shots of the same babe, to make sure the best photo of the babe is the one that makes the list.
We don't rank pictures of any girl against herself and we aren't going to start doing that. We don't want pictures falling down in ranks because they lost to the same girl who happens to have a lot of great pictures.

I think these criteria will provide a list that more people would understand and agree with.
I think we already follow many of the criteria that you mentioned (unless there was reason not to). I've considered all these things and implemented some and backed out others. I have been adjusting the algorithm for over 7 years

If you think any of this may be sour grapes because I’m new here, it isn’t, I have 15-20 of the top 100 spots. Many of my photos in the top 100 I don’t think should be there. I do have a few others that are great photos but aren’t on the list, and my reaction is WTF? Why these and not them? The answer is the automated rating system and how it functions.
Or maybe your taste just doesn't define what everyone thinks. Yeah, it's not an exact science. With hundreds of thousands of pictures, some are bound to creep up in rankings and some are bound to be rated lower than they should be. Overrated pictures don't last at the top long. I don't see the butt shot that you mentioned in the top 100.

Go look at the Maxim Hottest 100 that are manually picked and then come back and tell me that our software does a worse job than that.

I don't always agree with the results either but the general public is voting. Some votes are valid and some are not. Overall most are honest. We take that into account. If you vote a 7 above a 9 and then the next votes prove that the 7 is actually a 7, that vote is no longer included in either picture's calculation. You have to have a certain number of votes against people in the same range to be included in that range.

Show me a better, more accurate top 100 list.... I dare you
 

HotnessRater

Administrator
Staff member
But we are voting on the picture as it stands, not some mythical picture as it may be in the future. We should see its size. The pictures as displayed are too small. A large out of focus photo will look much sharper in detail when it is shrunk down, because of the data lost in shrinkage, and can look just as good as a sharp, shrunken photo. We cannot judge accurately as things stand. Trying to provide parity for small images by shrinking big ones down to their size results in a loss of resolution in larger (and very likely better) images. It skews the results against large images, because we cannot see their full quality.
We have to show both images side by side so we can't make them much bigger. Again, this was intentional to size the images down to fit on that screen. We fit them to the size of the screen. I don't know what else you want me to do. If you want to see bigger images get a larger screen. I do not want image size to come into the equation when voting.

So, hypothetically, if I voted enough to reach these limits for each babe, I couldn't get any more pictures to vote on?
No you can vote on them once in X amount of votes. Our picture number is bigger than X so there is no limit.

Isn't each matchup a single vote? Are you talking about the BotD? I'm talking about the votes that go into the rating, and each one certainly is a single vote.
No I'm not talking battle of the day and yes each matchup is a single vote


So the one pic vs pic matchup that includes multiple votes (and is therefore more reliable) isn't even included in a babe's rating? Naturally, you wouldn't include each individual vote, but the outcome of the matchup would be far more reliable than the individual votes that you do use. I'm saying that each pic vs pic match should have a minimum of 20 votes before it could be used in determining babe rating.
We don't have the concept of a "matchup"; there are just votes. And if we did what you are suggesting, we would have 1/20th the number of "votes" or "matchups" to make a judgement and nobody would be rated.

Which really helps make my points. The 9.9 rating on that pic isn't an accurate measure of its quality. Individual votes against other pics are given too much weight in deciding pic vs pic matchups. Many of the pic's "wins" would become losses if a 20 vote minimum were required for the matchup to count as a win or loss.
We don't have anywhere near the number of votes to do what you are suggesting.

Then there is the distribution problem. In a one vote match, you would expect some bias, and so would need to use a probability curve to decide where in the range the pic actually falls. A 9.9 pic could be expected to defeat some higher ranking pics (and hence its true rank would be somewhere below the top pics it defeated). Also, many of the pics it lost to are lower ranking, and better pics, why weren't these given more weight? Winning vs lower ranked pics should count for less than losing to lower ranked pics, where if the ranking was correct it would have won?
Winning vs lower ranked pics does count for less (or nothing at all if there is a big enough differential)
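
For anyone wondering how "counts for less (or nothing at all)" can work in practice, one common approach is an Elo-style update. To be clear, this is only a sketch of that general technique and not the site's actual calculation, which isn't spelled out in this thread. The function names, the `k` step size and the `scale` value are all made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 1.0) -> float:
    """Chance that A beats B under a logistic (Elo-style) model. Hypothetical helper."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def rating_change(winner: float, loser: float, k: float = 0.2, scale: float = 1.0) -> float:
    """Points the winner gains for one vote; k and scale are assumed values."""
    return k * (1.0 - expected_score(winner, loser, scale))


even = rating_change(9.0, 9.0)       # evenly matched pictures
favourite = rating_change(9.5, 7.0)  # beating a much lower-rated picture
upset = rating_change(7.0, 9.5)      # lower-rated picture beats a higher one

print(even, favourite, upset)
```

With these assumed settings, a win over a picture 2.5 points lower moves the rating by well under a hundredth of a point, while an upset moves it by almost the full `k`.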

I propose two challenges to demonstrate this point:

Have a faceoff day, put up battles for the Chriqui pic against her top 20 wins, each to be decided by a minimum of 20 votes.
Chriqui is down to 13 already but she has won against a lot of good pictures. That isn't coincidence. A lot of people like it. It is possible for a not quite as good picture to get high in the rankings but if it doesn't measure up, it will drop pretty fast as more match ups (votes) are given to it

Also, put up 20 battles vs lower ranked pics I choose by hand, choices to be significantly lower in rank and similarly clothed (ie no skimpy bikini shots vs her pic in a dress). This will demonstrate the inaccuracy of her rating, and the ratings on the other pics I choose.

I expect this will give you hard evidence of how well your system is performing. If her pic gets slaughtered (like losing 15 or more matches in each of the two challenges), then it is time to make some changes to improve the accuracy of your system.

I do hope your goal is to have the best rating system on the web.
The problem with your match up system is this: If a picture gets 100 people to vote on it, it has only been compared to 5 pictures. That doesn't even give us a good idea where she should be rated. In our system, 100 separate votes against an entire field of pictures gives us a really good idea. In fact 20 gives us a pretty good idea. With 20 votes from a single match up with your system, you would only have a very vague idea where to put them. You would need hundreds of individual votes to rank someone. There might be some votes in that which are biased but each one only counts for a small fraction of her rating... and if that picture that won and shouldn't have won really is a worse picture, its rating will fall and it will be no longer considered in the calculation of the better picture that should have won.

Let's put it this way. We have 1,207,719 active pictures on hotnessrater. We have had 19,798,639 votes over the years. That would mean each picture has an average of 16.39 votes. I think you can see the problem with that.
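
The arithmetic behind those numbers, as a quick check. The picture and vote totals are the ones quoted above; the 20-votes-per-matchup figure is the minimum proposed earlier in the thread, and the variable names are just for illustration.

```python
active_pictures = 1_207_719
total_votes = 19_798_639

# Average votes per picture, counted the same way as in the post
votes_per_picture = total_votes / active_pictures
print(round(votes_per_picture, 2))  # 16.39

# If every matchup needed 20 votes before it counted
votes_per_matchup = 20
matchups_per_picture = votes_per_picture / votes_per_matchup
print(round(matchups_per_picture, 2))  # 0.82
```

So on average a picture would not have finished even one 20-vote matchup.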
 

HotnessRater

Administrator
Staff member
It is gone now. I don't remember whose ass it was, even though I bought it. I figured it would get hits. It was scored 60ish on the list yesterday. I found 6 or 7 pics today that don't belong there though. Either there is a coding bug or the pics aren't flagged right when they enter your system.
I don't see any that don't belong in there, can you elaborate?

Then you can only determine the relative ranking of a babe's pics indirectly. It makes it harder to get the girl's top pic on the list. I think that your concern of a girl hurting her own ranking is larger than warranted. With all the pics on your system, how often would such a matchup occur right now, totally at random? And wouldn't such a rating be valid in ranking one photo over the other?
We don't put a girl in a position where one of her pictures has to lose. That would just hurt the person's rating overall. Yes these pictures are compared directly. I don't see a problem with that.

Well, I'm sure my tastes differ from the average adolescent male. Being a photographer, I put a higher emphasis on image quality, resolution, lighting, focus, depth of field, etc. than most people. I also rate on setting, composition, clothing, and style.
Yeah, it's not about voting on which picture has better lighting :)

Honest isn't the same thing as unbiased. You minimize the effects of individual bias by group confirmation (requiring multiple votes for a pic vs pic matchup). Of course, you still have the problem of group bias, but that's a social issue too big for a web site like this.
Of course there is group bias. That is what it is all about. Finding how people would rate these girls is pretty much 100% asking them for their biased opinion. If it wasn't biased, everyone would be a 10 (including Rosie O'Donnell - who we have identified as a 4.94 which to me seems pretty accurate https://hotnessrater.com/person/26787/rosie-odonnell)

Individual bias gets minimized with having a lot of different voters and yes pictures often get the same match up with different voters. I just don't force matchups like you suggested since it just wouldn't work. I didn't just start this yesterday. I have balanced and changed the algorithm for years based on the results. There is a margin of error of course and things continually balance out as more votes come in. Newer pictures can jump around in ratings until things settle down for them but overall I am pretty happy with our results.
 

HotnessRater

Administrator
Staff member
Here are some examples of blurry photos that are probably ranked higher than they deserve because during rating the images are shrunk to a point you can't tell:

https://hotnessrater.com/full-sized-picture/2065184/audrey-allen
https://hotnessrater.com/full-sized-picture/3590382/jessica-ashley
They are voting on how hot the girl is in the picture, not the picture size, lighting or whether it is blurry. Obviously most people don't think that is important which is why they are rated highly. Just because you think those are important qualities in these pictures doesn't mean everyone else does.

The picture that is displayed in the battle is this one: https://img1.hotnessrater.com/2065184/audrey-allen-lingerie.jpg?w=600&h=900

That is almost the full sized image. It is definitely big enough to tell if it is blurry. The pictures displayed in the battle are 40 pixels smaller than the original picture in both these cases. They are about 7 inches wide on my monitor
 

HotnessRater

Administrator
Staff member
So this picture:

https://hotnessrater.com/full-sized-picture/3956645/rosie-jones

defeated:

https://hotnessrater.com/full-sized-picture/1515407/hannah-ferguson

Only in a one-vote biased matchup, where pics are shrunk so you can't see their true image quality. With 20+ votes on full-sized images it wouldn't happen. These sorts of outcomes are strongly skewing your ratings.

The Rosie Jones pic also defeated:
https://hotnessrater.com/full-sized-picture/3898847/kara-del-toro
https://hotnessrater.com/full-sized-picture/3793702/lais-ribeiro
https://hotnessrater.com/full-sized-picture/593479/nina-agdal


Well, some of those might end up around 50/50. But what about:
https://hotnessrater.com/full-sized-picture/5509572/stacy-keibler
https://hotnessrater.com/full-sized-picture/1444547/whitney-cowart
https://hotnessrater.com/full-sized-picture/3895905/miranda-janine
https://hotnessrater.com/full-sized-picture/2919181/karen-lima
(Crop white blocks from this last pic so it can be displayed fully)

I think in full-sized matchups with multiple voters, the Jones pic loses most or all of those "wins".

There is another question: How good are the pictures it beat that are judged to be in its range? What is that range, btw?

Here are some pics the Jones image won against that I don't think deserve their ratings, and so shouldn't be used to boost the rating of the Jones pic:

https://hotnessrater.com/full-sized-picture/1948964/jessica-alba
(I love Jess, have loads of her pics on this site, and over 5,000 on my hard drive, but this one kinda blows, 9.67? Really?)
https://hotnessrater.com/full-sized-picture/1948964/jessica-alba 9.71???
https://hotnessrater.com/full-sized-picture/4499525/catherine-zeta-jones 9.75, crop of another pic, and shouldn't even be on the site
https://hotnessrater.com/full-sized-picture/3853166/noemi-olah 9.63 again


BTW, are ratings recalculated if a pic the ratings are based on is removed?
So you picked one Rosie Jones picture that was showing partial nipple as your argument. That picture shouldn't have even been on HotnessRater.

And if you had a look at the pictures that Rosie lost against, there were some pretty bad ones in there too. It balances out with enough votes. I'm not sure why we are even talking about this though since I have shown it isn't possible to give 20 vote match ups between pictures. Nothing would get rated. Besides, our system gets more accurate with more votes. If we had 20 times the votes that we do now (which is what we would need to make your system work), our accuracy rate would go up without changing our logic.

The Jessica Alba picture you put up in question has 18 total votes. With your system, she would still be working on her first match up. Watch that picture as it gets more votes and see what happens.

https://hotnessrater.com/full-sized-picture/4499525/catherine-zeta-jones 9.75, crop of another pic, and shouldn't even be on the site
Well we don't have a hard rule about cropped pictures but I have no way of knowing if that was cropped or not. I'm not sure what you are talking about here.

Really a picture with 8 total votes?

And another picture with 8 total votes.

So show me a picture that you think is rated wrong that has 200+ votes because that is what it would take to get a rating with the system you described.
 

HotnessRater

Administrator
Staff member
RE the cropped picture, my point is your moderators/uploaders should not have allowed/put the photo on your system to begin with.
Are you talking about cropped pictures or pictures with borders / white space around them? I have no way of knowing if it is a cropped picture. As far as pictures with white space around them, it's a judgement call based on how much white space and how good the picture is.

I came across a girl yesterday whose hotnessrater material was over 3000 boring catalog shots. Eliminate them.
Says you. A fan of that girl might like to have access to a lot of pictures of her. If a user took the time to upload them, why would I eliminate them? We limit the number of times a girl shows up in the voting. If you looked at her, you would find most of those pictures probably have no votes. It doesn't hurt anyone to have them available.

It would help reduce individual bias distorting your results.
I disagree that there is an "individual bias" distorting our results. I am pretty happy with our results

As for showing you a pic with 200+ votes that is rated wrong, look at the Chriqui pic. I'd really like to see the results of the two challenges I proposed, I'm about half done choosing my contenders.
I looked back through this thread and couldn't find where you proposed any challenges for that picture. What did you have in mind?
 

HotnessRater

Administrator
Staff member
Well, I don't see anything obvious about it. Since people are prevented from voting on photos in their true size, you have no actual data, just your opinion on why people vote how they do. When it comes to voting on different pictures, there is no valid statistical method to differentiate their reasons.
Well since they don't know the size of the picture, I think it is pretty obvious they aren't voting on the picture size. That is by design.

The idea that size, focus, lighting, and resolution make no difference to people and how they vote, especially on photos of similar hotness, goes against common sense, and feedback that pretty much every web site gets from its customers bears out that it matters. The fact that you try to normalize it away by shrinking great photos down to match inferior photos so the inferior photos have a better chance is acknowledgement on your part that what I say is true.
This isn't a photography site. We aren't voting on picture quality or resolution. We are voting on the hotness of the girl in the picture. The pictures are 600 pixels wide on the voting page. They aren't exactly tiny thumbnails. This is more than adequate to get an idea of how hot the girl in the picture is.

So, I took a SS of the BotD, saved off the images as presented on the BotD screen, and inspected them. I zoomed in my iPad screen to get the largest version of the image that I could, and took SS of them as well. I compared those SS to the images available after voting, when you can look at the real images, and the difference is noteworthy. Trying to zoom up the shrunk image results in a crappy image with lots of digital artifacts.
You are misunderstanding the whole concept of the site.
 

HotnessRater

Administrator
Staff member
I don't even know how you would propose to do a "full sized" voting screen. There is only so much screen space. When we display 2 pictures, it takes up the majority of the screen. We have a column to show the history on the left but we tried to make it minimal. Other than that, you are limited to the resolution of your screen. At the time we display the images, the javascript tells us the resolution of your screen and we serve back images that will fit the div.

In my case I got this image back https://img7.hotnessrater.com/1758/heidi-montag.jpg?w=700&h=1050

It allows for 700 x 1050px. If I had an ultra high resolution, it would have been bigger. If I was on a phone it would have been smaller. I'm not going to return a 2000px wide image to a 500px wide screen and waste everyone's bandwidth.
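
Roughly what "serve an image that fits the space" means, as a sketch. This is not the site's code: `fit_within` is a hypothetical helper, and it assumes the image is scaled down to the `w` and `h` bounds with its aspect ratio kept and is never enlarged past its original size.

```python
def fit_within(orig_w: int, orig_h: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Scale an image down to fit max_w x max_h, keeping aspect ratio; never upscale."""
    scale = min(max_w / orig_w, max_h / orig_h, 1.0)
    return round(orig_w * scale), round(orig_h * scale)


# A 1995 x 3000 original shown in a 700 x 1050 slot is limited by height
print(fit_within(1995, 3000, 700, 1050))  # (698, 1050)

# A landscape original in the same slot is limited by width
print(fit_within(3000, 2000, 700, 1050))  # (700, 467)

# An original smaller than the slot is left at its own size
print(fit_within(600, 400, 700, 1050))    # (600, 400)
```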

Look at these screen shots, there isn't much else we can fit and still show 2 images on the screen. We want the user to see them side by side and not to have to scroll back and forth

hr1.jpg


Some pictures will take the full height we give and some will take the full width:

hr2.jpg


And if I zoom out to 25% the screen looks like this.... You will notice how tiny everything is but we still serve images to fill the space. Most aren't big enough to fill the space but the one on the right was. It served up a 1995 x 3000px picture to fill the space.

I don't know how much more "full sized" you want... we just make sure it fits your screen

h3.jpg
 

HotnessRater

Administrator
Staff member
Add links to the page so people can pop up the full pic if they want to see it before voting, that's all. You don't need to revamp the whole screen. As you pointed out, there is only so much room to display them side by side, a very valid point. So is avoiding scrolling around.

Since clicking on the pic does nothing until you have voted, you could use that to give you a full size popup. I rather expected it to work that way the first time I voted on a BotD, I was a bit surprised when it didn't do that in the first place.
I originally avoided linking the picture to the pictures page just so the voter couldn't see what the picture was rated at. Changing the link so it just shows the full sized picture (and then click it again to dismiss it) is a great idea though. I will definitely implement that

... and that is pretty easy to do. It will likely be my next change
 

HotnessRater

Administrator
Staff member
Bottom 5 paragraphs of post 3 in this thread, I proposed two challenges, one between the Chriqui pic and her top 20 victories, and one between 20 pics chosen by me, pics to be similarly clothed and lower ratings (like in the 9.0-9.7 area). Hmmm. They should be pics not in her list of existing matchups, too, but that is a lot harder for me to manage, paging through her wins and losses takes a lot of time.
That picture won 663 battles from 77 different users - the person that uploaded her did not vote on her. The person that owns her did not vote for her.
That picture lost 66 battles total from 35 different users

18 of the people that voted for her also voted against her.

She beat this picture 4 times from 4 different users without losing to her. So in your best of 5 scenario she would have won that one.

https://hotnessrater.com/picture/2824555/pauline-jackson

Just because you are in a bikini and show your ass doesn't mean people

That list we show of wins and losses doesn't mean they just won or lost to them once. We group them so they only show up once in the list

Other pictures that she won multiple times against include:

4 times:
https://hotnessrater.com/picture/1345253/katee-owen

3 times:

https://hotnessrater.com/picture/4916401/amberleigh-west
https://hotnessrater.com/picture/1240088/josie-mont
https://hotnessrater.com/picture/1442527/jenna-renee-webb
https://hotnessrater.com/picture/4436282/jami-ferrell
https://hotnessrater.com/picture/3915691/luisana-lopilato
https://hotnessrater.com/picture/52643/emma-kuziara
https://hotnessrater.com/picture/1263290/gabriela-salles

2 times:

5452641
3958153
5521990
1445304
168437
3870156
4025606
3100485
3872082

I don't think this is just a case of the algorithm not working. She's a pretty girl in a classy dress with a good side boob shot. She's sexy without having to get into lingerie to show it off. I think people just appreciate that.

Granted with 663 battles and 77 voters, there is potential that there are a couple people that just vote a lot and really like her... but we would run into that problem if we demanded 5 or 20 votes on every match up too.

There are other ways we can limit users from voting on the same picture or girl more often to limit that and maybe that is something we can do. I don't think the answer is to throw away our whole system and start from scratch.
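
To put the numbers from this post side by side, here is a quick calculation. It uses only the counts quoted above (663 wins from 77 users, 66 losses from 35 users); the variable names are illustrative.

```python
wins, win_voters = 663, 77
losses, loss_voters = 66, 35

# Share of all votes on the picture that were wins
win_rate = wins / (wins + losses)
print(f"{win_rate:.1%}")  # 90.9%

# Average number of votes cast per voter on each side
print(round(wins / win_voters, 1))     # 8.6
print(round(losses / loss_voters, 1))  # 1.9
```

The averages say nothing about how the votes are spread across individual users, so they can't show on their own whether a few heavy voters account for most of the wins.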
 

HotnessRater

Administrator
Staff member
And I do think that the limited number of voters has skewed her rating, the pic is not a 9.9, 9.5 maybe.
And you can have that opinion. Others on the site disagreed with you and you are outvoted. That is how voting works... nobody said you would agree with the outcomes.

General statistical principles are still the soundest way to go, though. More votes per matchup equals greater reliability.
No, more votes in general are going to get you just as much accuracy. In sports, they look at the overall win/loss for a team to decide their overall rankings. They don't group all matchups between each 2 teams, decide which team is better in each match up and then rank them based on that. Yet, the overall win/loss is very telling and gives an accurate view of where they are in the ranking. If they are equally matched there end up being some games won and some lost but they all count towards their ranking. This is how we do it. I don't need to define a definitive winner between two pictures to build a relative ranking. What you are suggesting does NOT create more accuracy, it just requires more votes. In fact, I could argue that being compared to a wide range of pictures all of which have been compared to each other creates more accuracy than limiting the picture to fewer match ups.
 

HotnessRater

Administrator
Staff member
Actually, in college football coach's poll and things like that they do consider who you played and what the score was.
Congratulations, you found the one anomaly in sports that you can use to be argumentative. Of course there are other ways to rank and rate things. Does the NFL use the coach's poll to decide who will play in the SuperBowl? How are the teams that enter the playoffs in any major sport decided?

Coach's polls are also generally in agreement with the regular polls anyways, so what does it accomplish to switch it over? (Other than requiring more votes than we could possibly get, creating a ton of work for me and overnight changing most of our rankings to "unranked" since most of them wouldn't have finished a single battle)
 

HotnessRater

Administrator
Staff member
Well, as I said before, I'd institute gradual changes.
As I said before, I wouldn't

To start, I'd eliminate shlock pics that 98% of your raters would prefer not to use their rating time on, and 98% of your viewers don't care if they ever get seen or rated
Users can set the parameters for the pictures they want to rate. I'm not removing pictures just because they rate low. The premise of the site isn't to just rate hot chicks. It is to rate anyone... yes men and ugly women too.

At the same time, I'd raise the rating standard to best 2 of 3 (so two votes would be enough if they are in agreement)
I know you would... and I wouldn't. We have been over this time and time again. It isn't going to happen.

Once you have a large enough base of 3 vote ratings, then raise the standard to 5
We never would... we still have a large number of pictures with no votes at all.

I'd give points to players for doing rating, say, 1 point per 10 ratings.
I thought about that but it would just result in people voting as fast as they could and not even looking at the pictures.

and a lot of the coding is pretty straightforward
You have no idea if it would be straightforward or not. I don't have the concept of a match up that can get multiple votes. There would have to be new tables involved and my ratings calculations would completely have to be revamped. You don't know what we have to do now to come up with a rating, so you can't say whether or not this change is straightforward.
 

HotnessRater

Administrator
Staff member
Let's say I have 10 pictures and 50 potential votes. The lower the picture number the hotter the picture. So Pic1 is the hottest and Pic10 is the least hot.
We will assume a 10 vote match eliminates all error.

I could structure the votes like this:

Pic1 vs Pic2 - 10 votes = Pic1 wins
Pic3 vs Pic4 - 10 votes = Pic3 wins
Pic5 vs Pic6 - 10 votes = Pic5 wins
Pic7 vs Pic8 - 10 votes = Pic7 wins
Pic9 vs Pic10 - 10 votes = Pic9 wins

or I can give each picture 5 votes against random pictures and get

Pic1 is 5-0
Pic2 is 4-1
Pic3 is 4-1
Pic4 is 3-2
Pic5 is 2-3
Pic6 is 2-3
Pic7 is 2-3
Pic8 is 2-3
Pic9 is 1-4
Pic10 is 0-5

With the top method, you have no idea of the relative hotness of Pic1, Pic3, Pic5, Pic7 or Pic9. You can really only give 2 number ratings: half are 1s and half are 10s.

The bottom method (which I use) gives you a pretty good idea of relative rating after the 50 votes. It's not perfect but I can already hand out 6 different rating levels, as opposed to the other method that can only give out 2.

Neither method is entirely reliable and both will of course benefit from more votes

So let's say we get 50 more votes. Now with your method you add
Pic1 vs Pic3 = Pic1 wins
Pic5 vs Pic7 = Pic5 wins
Pic9 vs Pic2 = Pic2 wins
Pic4 vs Pic6 = Pic4 wins
Pic8 vs Pic10 = Pic8 wins

Now we have
Pic1 with 2 wins
Pic2 with 1 win
Pic3 with 1 win
Pic4 with 1 win
Pic5 with 2 wins
Pic6 with 0 wins
Pic7 with 1 win
Pic8 with 1 win
Pic9 with 1 win
Pic10 with 0 wins

So which one is hotter, Pic5 or Pic6? How about Pic4 or Pic5? Pic6 or Pic10?

You really still don't have any idea how to rank these after 100 votes. You're not really even sure if Pic1 is better than Pic5.

With my method, there might be some bad votes but even with that calculated in, you can get a better idea of relative ratings and my way gets more accurate faster as votes are added... especially vs large sets of pictures.
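
The win counts for the fixed-matchup method can be tallied straight from the ten match results listed in this post. This is just a small sketch of that bookkeeping; the variable names are illustrative.

```python
from collections import Counter

# (winner, loser) for each multi-vote match in the example
round_one = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
round_two = [(1, 3), (5, 7), (2, 9), (4, 6), (8, 10)]

# Start every picture at zero so pictures with no wins still appear
wins = Counter({pic: 0 for pic in range(1, 11)})
for winner, _loser in round_one + round_two:
    wins[winner] += 1

# Group the pictures by how many matches they have won
tiers = {}
for pic in range(1, 11):
    tiers.setdefault(wins[pic], []).append(pic)

for count in sorted(tiers, reverse=True):
    print(count, "win(s):", tiers[count])
```

Tallied this way, Pic4 picks up one win from its match against Pic6, and after 100 votes the ten pictures fall into only three groups: 2 wins (Pic1, Pic5), 1 win (Pic2, Pic3, Pic4, Pic7, Pic8, Pic9) and 0 wins (Pic6, Pic10).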
 

Well, instead of the above, use 5-vote matches in this hypothetical example. It isn't going to be perfect, but it will be stronger than single-vote matches; the study of probability and statistics makes this clear.
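
As a rough illustration of the statistics argument, assume each vote independently picks the truly hotter pic with probability `p`. That is an assumption for illustration only, since real voters are neither independent nor identical. Under it, the chance that a majority of `n` votes gets a matchup right follows from the binomial distribution. The function name `majority_correct` is hypothetical.

```python
from math import comb

def majority_correct(p: float, n: int) -> float:
    """Probability that more than half of n independent votes are correct.

    Assumes n is odd, so a majority always exists.
    """
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

# Assumed per-vote accuracy of 70%.
for n in (1, 3, 5):
    print(n, round(majority_correct(0.7, n), 3))
# 1 0.7
# 3 0.784
# 5 0.837
```

This only covers the reliability of one matchup. It says nothing about how best to spread a fixed number of votes across many pictures.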

Second, after the 1-2 match, do 1-3 and 2-4. Now, with 15 votes, you have some relationship between 4 pictures. Don't create new matches between unrated images, as they have no relationship to your ranking structure. Even with an idea of how the two compare, the pair could slide up or down your scale because they have no connection to it. Each time you add a new pic to your rating tree, the new pic gets 5 votes, and the pic it is matched against gains 5 more votes and more match data, raising the confidence level in its placement. After 45 votes, every pic is connected to the tree at some point. That leaves you 55 votes to assign to matches that clarify anything that needs it.
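
A minimal sketch of the "45 votes connects everything" arithmetic, using a small union-find to check that every pic is attached to the tree. The schedule beyond 1-2, 1-3 and 2-4 is a hypothetical choice; any schedule where each new pic meets a pic already in the tree gives the same count.

```python
VOTES_PER_MATCH = 5
pics = [f"Pic{i}" for i in range(1, 11)]

# Union-find: each pic starts in its own group.
parent = {p: p for p in pics}

def find(p: str) -> str:
    while parent[p] != p:
        parent[p] = parent[parent[p]]  # path halving keeps lookups short
        p = parent[p]
    return p

def union(a: str, b: str) -> None:
    parent[find(a)] = find(b)

# 1-2 first, then 1-3 and 2-4, then each new pic meets one already in the tree
# (hypothetical continuation: Pic5 vs Pic3, Pic6 vs Pic4, and so on).
matches = [("Pic1", "Pic2"), ("Pic1", "Pic3"), ("Pic2", "Pic4")]
matches += [(pics[i - 2], pics[i]) for i in range(4, 10)]

for a, b in matches:
    union(a, b)

votes_used = VOTES_PER_MATCH * len(matches)
connected = len({find(p) for p in pics}) == 1
print(votes_used, connected)  # 45 True
```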

Note that this goes unseen by the person sitting at the computer doing ratings; they are just voting on various individual matches that are in progress, without being aware of the context of their votes. From their point of view it looks random, but really it is not. The system assigns votes where it needs matches clarified to increase matchup confidence.

Now you start confirmation ratings. If a vote was close, add a couple more votes to it. If a pic beat two other pics, now you rate those two against each other. You won't cover everything without enough votes, but you work to increase your confidence in matchups. If a vote was 4-1 or 5-0, your confidence in the relationship between the two pics is pretty high, so work on raising confidence elsewhere. Once your starting pics are sorted in the tree, apply additional votes to adding new pics to the tree or to increasing confidence.

If pic A > B, and B > C, then it is safe to assume A > C. By the time you have 10 pics in the ranks, you can start assigning numbers.
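
The inference step can be sketched as a transitive closure over the "beats" relation. This is only an illustration with a hypothetical helper name. It also assumes the results contain no cycles: real vote data can produce A > B > C > A, and then the inferred pairs contradict each other.

```python
from itertools import product

def transitive_closure(beats: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Add every (a, c) implied by chains such as a > b and b > c."""
    closure = set(beats)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and a != d and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

observed = {("A", "B"), ("B", "C"), ("C", "D")}
inferred = transitive_closure(observed) - observed
print(sorted(inferred))  # [('A', 'C'), ('A', 'D'), ('B', 'D')]
```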

Of course, there are ways to stretch your vote resources. If you are starting with best-of-5 matches, you can stop votes temporarily if the score is 3-0 and apply the 2 saved votes elsewhere. You can always add more votes later, which will either confirm the result, raising confidence, or contradict it, lowering confidence. If confidence is lowered, apply more votes. If a particular pair never resolves, treat it as a statistical tie at a programmed decision point based on your current average number of ratings per matchup. These ties aren't a problem; they will eventually be grouped into a rating pool.
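
A small sketch of the early-stop idea for best-of-5 matches. The function names are hypothetical, and the statistical-tie decision point is left out because the threshold is not specified.

```python
def clinched(wins_a: int, wins_b: int, best_of: int = 5) -> bool:
    """True once either pic has a majority the remaining votes cannot overturn."""
    needed = best_of // 2 + 1
    return max(wins_a, wins_b) >= needed

def votes_saved(wins_a: int, wins_b: int, best_of: int = 5) -> int:
    """Votes freed for other matchups if the match is paused once clinched."""
    if not clinched(wins_a, wins_b, best_of):
        return 0
    return best_of - (wins_a + wins_b)

print(clinched(3, 0), votes_saved(3, 0))  # True 2
print(clinched(2, 1), votes_saved(2, 1))  # False 0
```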

I'd start by running an analysis on your existing data to see how many votes each PVP matchup has. If you have 100,000 or so pics with 5+ vote matchups already, that's a pretty good starting tree. If it is more like 20,000, then you need to make a choice: go with 5-vote matchups to start and prioritize attaching pics with fewer, or go with 3-vote matchups for a larger starting tree.

I see a lot of pics with hundreds of votes. The most I can recall seeing was in excess of 12,000. No doubt there are others with more. The relationship between these pics would establish the main order of your tree. Really, it is more like a train than a tree. With a tree, you cannot tell which limb is greater, so it would be important to schedule votes to make this determination, so you link the pics in line, like train cars. When two long lines of pics are in parallel, it may take votes all along the line to merge them.

At a certain point you start establishing pools of basically equal pics at your score points. With two decimal places, and people not wanting to look at pics below 8 much, you have about 200 rating locations: each .01 increment from 8 to 10. With three decimal places you have 2000 pools of pics. Your pics in your starting tree get distributed into your pool points. The first priority for new pics is to get placed into a pool. You do this with a binary sort algorithm of matches up and down the list of pools. With 15 to 20 matches against well-established benchmarks, plus other random pics in different pools, a starting pool is chosen. To climb the rankings, a pic has to defeat a strong majority of other pics in its pool.

If the binary sort is solid, the pic won't move far from its initial placement, although it should continue to get matchups from time to time to increase the confidence on its placement.
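
Here is a rough sketch of the placement step as a binary search over pools spaced .01 apart from 8 to 10. The callback `beats_benchmark` stands in for "run a match against a benchmark pic from that pool" and is purely hypothetical. The sketch also assumes results are consistent, meaning a pic that beats a pool also beats every lower pool, which real votes will not guarantee.

```python
from typing import Callable

# One pool per .01 step from 8.00 to 10.00 inclusive (201 pools).
pools = [round(8 + i / 100, 2) for i in range(201)]

def place_in_pool(pools: list[float],
                  beats_benchmark: Callable[[float], bool]) -> tuple[float, int]:
    """Return the highest pool whose benchmark the pic beats, and matches used.

    Assumes the pic at least belongs in the lowest pool.
    """
    lo, hi = 0, len(pools) - 1
    matches = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2      # upper middle so the loop always shrinks
        matches += 1
        if beats_benchmark(pools[mid]):
            lo = mid                  # pic belongs at this pool or higher
        else:
            hi = mid - 1              # pic belongs below this pool
    return pools[lo], matches

# Simulated pic whose "true" level is 9.219 (made-up value for the demo).
pool, matches = place_in_pool(pools, lambda score: 9.219 >= score)
print(pool, matches)
```

With 201 pools the search needs at most 8 comparison steps, so running two or so matches per step is in line with the 15 to 20 matches mentioned above.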

Say there are 300 pics at 9.219. To have confidence that a pic is better than the pool it is in, it needs to win a certain % of matches, both in its pool and around it. Then it moves up. It's like a train car getting moved up in the line. If it does really well, it might jump a few pools.

Note that with this system, it would be very rare for a pic to end up a 10.0 with 15 individual votes, as happens comparatively often at the moment. Currently, pics with low votes are often overrated, and then slowly drop.

There are two problems with this. First, other pics are having their own ratings inflated by defeating these overranked pictures, perpetuating overrating as new pics come in.

The second problem is, for reasons I don't understand, a lot of pics seem to stop getting rating votes. I frequently see pics with very low ID numbers that have very few votes. Presumably, these low-numbered pics are the earliest pics on the server, and having been around the longest they should have more votes than newer pics, at least as a general tendency. However, it often seems to me that this isn't the case, although my data points are limited to the number of pics I have seen (4400 purchases, 500+ auctions, and pics I didn't buy because of cost or quality, maybe another 15-20,000?)

Anyway, a vote stoppage will freeze the overrankings I was talking about above in place, exacerbating the problem.
 

HotnessRater

Administrator
Staff member
Well, instead of the above, use 5-vote matches in this hypothetical example. It isn't going to be perfect, but it will be stronger than single-vote matches; the study of probability and statistics makes this clear.
I disagree; it has the same problem (just not as bad). In fact, as you increase the match size, the problem gets worse. The ideal match size is 1.
 