Score propagation

The idea of propagating scores from one node to another.

Truth scoring

//

Normative scoring

The problem of double scoring

Consider the following claims.
01 | Cars are going too fast
02 | 01 increases the risk of pedestrians getting hit
Assuming you’re scoring on an axis of good/bad, how would you handle these claims? You might want to score 01 as bad since speeding is dangerous. But 02 is more specific about what the danger is, so if you score 02 as bad as well, you’ll have scored the danger twice. The implications of this depend on whether the system uses score propagation. If it doesn’t, then you merely have a user experience that becomes tiresome when people find themselves scoring a chain of related claims all as bad for the same reason. If scores are propagated, then the problem of double scoring really takes shape.

If the bad score from 02 is propagated through to 01 and combined with the bad score that was given directly to 01, then 01 ends up scored twice as badly as it should be, since both scores were responses to the same danger.
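
To make the failure mode concrete, here is a minimal sketch in Python, assuming a naive propagation rule that sums a node’s direct scores with the propagated scores of the claims pointing at it (the rule and all the names are hypothetical):

```python
# Hypothetical graph: 02 ("01 increases the risk of pedestrians
# getting hit") points at 01 ("Cars are going too fast").
direct_scores = {
    "01": [-1.0],  # a user scores 01 as bad: speeding is dangerous
    "02": [-1.0],  # the same user scores 02 as bad: the same danger
}
points_at = {"02": "01"}  # 02 contributes its score to 01

def propagated_score(node: str) -> float:
    """Naive rule: a node's score is its direct scores plus the
    propagated scores of every claim that points at it."""
    total = sum(direct_scores.get(node, []))
    for child, parent in points_at.items():
        if parent == node:
            total += propagated_score(child)
    return total

print(propagated_score("01"))  # -2.0: one danger, counted twice
```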

Only score leaf nodes

The problem with this is that you can’t express a sentiment about something without first modelling it explicitly as its own leaf node.

Score leaf + internal nodes with clever UI to disambiguate

When you go to score something in the interface, show a visual representation of all the chains of neighbours that are contributing score to it. Gray out the intermediate nodes; the leaves stay active and can be scored directly from this view. Finally, leave a place to score the node you’re actually viewing. It could be labelled “other” to indicate that the score you’re contributing isn’t captured by any of the existing chains.

This works almost perfectly, except that whenever you add a new node that contributes score, you have to reset the existing “other” score, because you don’t know whether it was set partly in reference to the thing that has now become explicit.
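
A sketch of the bookkeeping this implies, assuming each node keeps per-user “other” scores that must be invalidated whenever a new contributing node appears (the structure and names are assumptions, not a prescribed design):

```python
class Node:
    def __init__(self, claim: str):
        self.claim = claim
        self.contributors = []   # nodes whose scores flow into this one
        self.other_scores = {}   # user -> residual score not captured
                                 # by any existing chain

    def add_contributor(self, node: "Node") -> None:
        # The new contributor may cover part of what users meant by
        # their "other" scores, and there is no way to tell how much,
        # so the only safe move is to reset them.
        self.contributors.append(node)
        self.other_scores.clear()
```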

The awkwardness of scoring claims

There is a subtler problem underlying the above discussions. Scoring 02 | 01 increases the risk of pedestrians getting hit is actually slightly awkward; the part about 01 increasing the risk is beside the point. You want to say that pedestrians getting hit is bad. The relations of other things to pedestrians getting hit may be bad, but their badness is exclusively derived from the fact that pedestrians getting hit is bad. You have two approaches here.

@todo: introduce the idea of claim-embedding vs metadata here

Claim-embedded vs. metadata

The question of whether to model things as claim-embedded or as metadata arises in other parts of the system as well, e.g., truth scoring, relations, etc.

01 / Introduce fragments as first-class entities

You could permit a node such as 04 | Pedestrians getting hit, which you could then score negatively.
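
As a rough sketch of the data model this suggests, a fragment would be a scoreable entity that isn’t itself truth-apt, while claims remain truth-apt statements that may reference fragments (the types and fields are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Fragment:
    """A scoreable entity that isn't itself truth-apt,
    e.g. 'Pedestrians getting hit'."""
    text: str
    normative_scores: list = field(default_factory=list)

@dataclass
class Claim:
    """A truth-apt statement, e.g. '01 increases the risk of
    pedestrians getting hit', which may reference fragments."""
    text: str
    references: list = field(default_factory=list)

hit = Fragment("Pedestrians getting hit")
hit.normative_scores.append(-1.0)  # the sentiment lands on the fragment,
                                   # not on any claim about it
```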

02 / Embed scores in claims

You could model normative scores by embedding them in claims.
03 | Pedestrians getting hit is bad
This is nice because //

// Perhaps combine 03 with some kind of functional variable mechanism so that it becomes a variable-augmented claim.
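
Purely as a guess at what that might mean, a variable-augmented claim could be a template with a bindable slot:

```python
from dataclasses import dataclass

@dataclass
class VariableClaim:
    """A claim template with a slot, e.g. '{x} is bad'.
    Binding the slot yields a concrete claim like 03."""
    template: str

    def bind(self, x: str) -> str:
        return self.template.format(x=x)

bad = VariableClaim("{x} is bad")
print(bad.bind("Pedestrians getting hit"))
# -> 'Pedestrians getting hit is bad'
```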

However, it’s not immediately clear what to do with a claim like 03 — how to connect it to the rest of the graph in a useful way. There are two main approaches.

Stay claim-embedded

First, if you really want to keep everything claim-embedded, you could say 04 | 02 is bad and 05 | 03 supports 04. // Advantage of keeping everything explicit.

Introduce arguments as first-class entities

// Can you convert between claim-embedded and arguments? Can you convert between claim-embedded and other, non-claim-embedded elements? If the advantage of claim embedding is that it gives explicit handles, perhaps you only break things out into claims if they actually need to be debated.

Why scores must be embeddable in claims

As CDL points out, there are many examples of normative claims where the normative value is debatable. If normative scores existed entirely in metadata, there would be no way of debating whether a thing is good or bad; all you could do would be to place your score. Therefore, even if some normative scores exist as metadata, there must be a way to handle normativity in a claim-embedded way as well.

Are truth scores somehow better suited to be metadata than normative ones?

The non-convergence problem of normative scoring

Related to why scores must be embeddable in claims is the observation that normative scores are fundamentally non-convergent. That’s what it means for them to be normative: they’re not based on an objective standard. Normative scoring systems therefore face what could be called the non-convergence problem: disagreements are erased at every level of the propagation process. Truth scores don’t have this problem. If you take the average of two truth scores, you’re reasonably likely to end up closer to the truth. If you take the average of two normative scores, you’ve destroyed the evidence of a legitimate disagreement.
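
A toy illustration of the asymmetry, assuming scores on a [-1, 1] axis and simple averaging as the aggregation rule:

```python
import statistics

# Truth scores: independent estimates of one underlying fact.
truth_estimates = [0.7, 0.9]              # two noisy reads on the same fact
print(statistics.mean(truth_estimates))   # 0.8, plausibly closer to truth

# Normative scores: there is no underlying fact to converge on.
normative_scores = [1.0, -1.0]            # a genuine moral disagreement
print(statistics.mean(normative_scores))  # 0.0, which reads as indifference
```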

The problem of scale in normative scoring

Imagine the following claims.
03 | 01 increases the risk of insects getting hit
04 | 01 increases the risk of wildlife getting hit
05 | 01 increases the risk of pedestrians getting hit
06 | 01 increases the risk of a crowd of hockey fans getting hit
07 | 01 increases the risk of genocide
Of course, these are not all equally likely to be true, but if you’re scoring morality separately from truth then you have a problem: the severities involved span many orders of magnitude, and a single scoring axis has to accommodate all of them.

You can’t score on an absolute scale

We don’t make moral evaluations on an absolute scale — they’re highly context-dependent.
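
A small illustration of why a single absolute axis fails, using made-up severity numbers and a hypothetical bounded scale:

```python
# Honest relative severities span many orders of magnitude...
severity = {
    "insects getting hit": 1,
    "pedestrians getting hit": 10_000,
    "genocide": 1_000_000_000,
}

# ...but a bounded axis of [-1, 0] forces them together.
def to_bounded_score(s: int, worst: int = 1_000_000_000) -> float:
    return -s / worst

for name, s in severity.items():
    print(name, to_bounded_score(s))
# insects (-1e-09) and pedestrians (-1e-05) land indistinguishably
# near zero, even though one is vastly more serious than the other
```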

Confine scores to isolated contexts

//

Score calibration

//

Only use comparative scores

//

The pitfalls of moral calculus

  • If all these claims coexist in a single system, you must have places where you can compare them. Find examples of how comparisons can, in pure utilitarianism, create unintended ethical artifacts.
  • Perhaps it would be illuminating to touch on animal vs. human rights, or other pairs of issues where one seems less important in comparison to the other and yet both are clearly important to address.
  • You need to address the idea that we should all be focusing only on the most severe end of the moral spectrum, and the critique of well-intentioned projects that asks “don’t we have more pressing matters to attend to in X other field?”

The poll layer

What if scoring were done through a flexible, user-defined “poll layer” whose results are interpreted and made use of in the local region of the graph, in ways that an LLM prescribes based on the nature or content of the poll? A variety of questions could be asked about different claims, and users could respond according to the parameters of the poll. Polls themselves could be promoted or demoted in the UI according to how popular they are. The results of the polls could be displayed in ways that make sense given the poll contents. The poll results could also influence the UI layer by modulating visibility scores or other things (see Relevance ranking). If you really struggle for engagement on this layer, you could consider exchanging feedback for points.
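
A rough sketch of what a poll record might contain, where every field is an assumption about how such a layer could work:

```python
from dataclasses import dataclass, field

@dataclass
class Poll:
    question: str                  # free-form, user-defined
    target_claims: list            # IDs of the claims the poll is about
    responses: dict = field(default_factory=dict)  # user -> answer
    popularity: int = 0            # drives promotion/demotion in the UI

# An LLM would then be asked how the results should be interpreted in
# the local region of the graph, e.g. whether they should modulate
# visibility scores (see Relevance ranking).
poll = Poll(
    question="Which of these risks matters more?",
    target_claims=["05", "06"],
)
poll.responses["some_user"] = "05"
```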