Bug on Hivemind’s following data

Project Information

Problem

The Hivemind-backed api.steemit.com returns invalid or missing following data for some accounts, compared with a full node.

How to reproduce

  1. Query the user curbot's following list on api.steemit.com (condenser_api.get_following):
curl -s --data '{"jsonrpc":"2.0", "method":"condenser_api.get_following", "params":["curbot",null,"blog",100], "id":1}' https://api.steemit.com
  2. Run the same query against a full node (https://rpc.usesteem.com):
curl -s --data '{"jsonrpc":"2.0", "method":"condenser_api.get_following", "params":["curbot",null,"blog",100], "id":1}' https://rpc.usesteem.com

You can see that the responses differ: the one from api.steemit.com is incomplete.
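The two curl calls can also be compared programmatically. Here is a minimal sketch using only the Python standard library. The helper names (build_payload, fetch_following, missing_on_hivemind) are made up for illustration, and the sketch assumes each entry in the result carries a "following" field.

```python
import json
from urllib.request import Request, urlopen


def build_payload(account, limit=100):
    """Build the same JSON-RPC request body as the curl commands."""
    return {
        "jsonrpc": "2.0",
        "method": "condenser_api.get_following",
        "params": [account, None, "blog", limit],
        "id": 1,
    }


def fetch_following(node_url, account, limit=100):
    """POST the request to a node and return the set of followed account names."""
    body = json.dumps(build_payload(account, limit)).encode("utf-8")
    request = Request(node_url, data=body,
                      headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=30) as response:
        result = json.loads(response.read().decode("utf-8"))["result"]
    # Assumption: each entry looks like
    # {"follower": ..., "following": ..., "what": [...]}.
    return {entry["following"] for entry in result}


def missing_on_hivemind(hivemind_names, full_node_names):
    """Names the full node reports but the Hivemind-backed node does not."""
    return sorted(set(full_node_names) - set(hivemind_names))
```

Calling fetch_following("https://api.steemit.com", "curbot") and fetch_following("https://rpc.usesteem.com", "curbot"), then passing both results to missing_on_hivemind, should list the accounts that only the full node reports.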

A Python script to detect discrepancies

I believe this is not an exceptional case. I have seen more discrepancies like this while testing and benchmarking the tower's new endpoints.

This Python script detects discrepancies in follower lists.

from steem import Steem
from steem.account import Account

HIVEMIND_NODE = "https://api.steemit.com"
FULL_NODE = "https://rpc.usesteem.com"


def get_followers(account, node):
    """Return the set of followers of `account` as reported by `node`."""
    steemd = Steem(nodes=[node])
    return set(Account(account, steemd_instance=steemd).get_followers())


def get_diff(account):
    """Print the followers that only one of the two nodes reports."""
    followers_on_hivemind = get_followers(account, HIVEMIND_NODE)
    followers_on_full_node = get_followers(account, FULL_NODE)

    print(
        "Accounts listed on api.steemit.com but not in the rpc.usesteem.com")
    print(followers_on_hivemind - followers_on_full_node)
    print("*" * 42)
    print(
        "Accounts listed on rpc.usesteem.com but not in the api.steemit.com")
    print(followers_on_full_node - followers_on_hivemind)


if __name__ == "__main__":
    get_diff("emrebeyler")


The result for @emrebeyler's followers:

Accounts listed on api.steemit.com but not in the rpc.usesteem.com
set()
******************************************
Accounts listed on rpc.usesteem.com but not in the api.steemit.com
{'hariyati.amin', 'curbot', 'kenzyobiadi', 'erhanbute'}

After some digging, I found a rare case involving a differently formatted custom_json.

For example, I checked curbot's account history for the moment it followed my account and found this transaction:

Transaction ID: aaccccb73b6dfcb4bbf95f6d2dcb76e1c87137e9

It looks like curbot was bundling several follow operations into one transaction, and steemd picked these up and registered them as valid follow actions.

However, hive's indexer ignores the custom_json op if the loaded JSON's length is greater than 2.

https://github.com/steemit/hivemind/blob/f7a467921678d928a0d94928c811442b8ab80bce/hive/indexer/custom_op.py#L55

In this case the length is greater than 2 because the payload looks like this:

[
    ['follow', {
        'follower': 'curbot',
        'following': 'kevinwong',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'nothingismagick',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'simnrodrguez',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'steem-ua',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'decentraland',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'mikepm74',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'empath',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'emrebeyler',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'eroche',
        'what': ['blog']
    }],
    ['follow', {
        'follower': 'curbot',
        'following': 'ervinneb',
        'what': ['blog']
    }]
]

This explains curbot.
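One way an indexer could cope with the bundled format is to normalize every payload into a list of follow dicts before processing. The sketch below is a hypothetical helper (extract_follow_ops is not a Hivemind function). Whether its accepted shapes match steemd's exact parsing rules is an assumption based only on the examples in this post.

```python
import json


def extract_follow_ops(raw_json):
    """Return a list of follow payload dicts from a custom_json 'json' string.

    Shapes handled (assumed from the examples in this post):
      - legacy bare object: {"follower": ..., "following": ..., "what": [...]}
      - single op:          ["follow", {...}]
      - bundled ops:        [["follow", {...}], ["follow", {...}], ...]
    """
    data = json.loads(raw_json)

    if isinstance(data, dict):
        return [data]
    if not isinstance(data, list) or not data:
        return []

    # A single op is a two-element list whose first element is the op name.
    if len(data) == 2 and isinstance(data[0], str):
        candidates = [data]
    else:
        candidates = data

    ops = []
    for item in candidates:
        # Keep only well-formed ["follow", {...}] pairs; skip anything else.
        if (isinstance(item, list) and len(item) == 2
                and item[0] == "follow" and isinstance(item[1], dict)):
            ops.append(item[1])
    return ops
```

With a helper like this, a bundle of ten follows yields ten payloads that can each go through the normal single-follow code path, instead of the whole op being dropped.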

Regarding my other 3 missing followers:

Follower       Following   Tx ID                                     Block num  Timestamp
erhanbute      emrebeyler  d10dcd1bdb661fc4e63f2464fa2262624db5d003  26710986   2018-10-11T09:55:21
kenzyobiadi    emrebeyler  9ef235eb36aac5e466b97ad3e459b7eb9495f898  26492393   2018-10-03T19:38:45
hariyati.amin  emrebeyler  383a36f7aa65724eb634ebdae141366674dc1df8  26450469   2018-10-02T08:41:33

The timestamps show that these follows happened between 2018-10-02 and 2018-10-11. These transactions don't involve anything unusual.

Additionally, I checked roadscape's followers on Steem:

I got these discrepancies:

{'curbot', 'kamvreto', 'msutyler'}

We already know the problem with curbot, so I checked the other accounts.

kamvreto followed roadscape at 2016-07-25T22:35:12.

Here is the account history output:

{
    'trx_id': '2b7595b1f3e0e0105156d518b83d7eeaa19b6070',
    'block': 3514062,
    'trx_in_block': 3,
    'op_in_trx': 0,
    'virtual_op': 0,
    'timestamp': '2016-07-25T22:35:12',
    'op': ['custom_json', {
        'required_auths': [],
        'required_posting_auths': ['kamvreto'],
        'id': 'follow',
        'json': '{"follower":"kamvreto","following":"roadscape","what":["posts","blog"]}'
    }]
}

It was a legacy custom_json transaction. The tricky part is that the transaction's what property includes two elements.

You can see that the Follow constructor expects one element:

https://github.com/steemit/hivemind/blob/60dc61ee4bbde2080421a3fdf10c5b83be840e8b/hive/indexer/follow.py#L71

For this reason, Hive ignores this op as well.
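A more tolerant reading of the what property could drop unrecognized legacy entries and accept the op when exactly one known state remains. The following is only a sketch of that idea: follow_state and KNOWN_STATES are hypothetical names, and the rule itself is an assumption, not something taken from the Hivemind or steemd source.

```python
# Assumed set of meaningful states; "posts" is treated as a legacy extra.
KNOWN_STATES = ("blog", "ignore")


def follow_state(what):
    """Map a follow op's `what` list to a single state.

    Returns:
      ""        for an empty list (treated here as "no relationship"),
      "blog" or "ignore" when exactly one known state is present,
      None      when the value is malformed, ambiguous, or unrecognized.
    """
    if not isinstance(what, list):
        return None
    if not what:
        return ""

    # Keep known states only, dropping duplicates while preserving order.
    recognized = list(dict.fromkeys(w for w in what if w in KNOWN_STATES))

    if len(recognized) == 1:
        return recognized[0]
    return None
```

Under this rule, ["posts", "blog"] resolves to "blog", so the two legacy follows of roadscape would be kept, while a contradictory list such as ["blog", "ignore"] is still rejected.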

The problem is the same for roadscape's other missing follower:

{
    'trx_id': 'c7694ff17ba7ba3fbe1740f05c2727ecbd98cd62',
    'block': 3409232,
    'trx_in_block': 1,
    'op_in_trx': 0,
    'virtual_op': 0,
    'timestamp': '2016-07-22T06:18:27',
    'op': ['custom_json', {
        'required_auths': [],
        'required_posting_auths': ['msutyler'],
        'id': 'follow',
        'json': '{"follower":"msutyler","following":"roadscape","what":["posts","blog"]}'
    }]
}

Expanding the sample size:

Discrepancies on @utopian-io's followers:

Accounts listed on rpc.usesteem.com but not in the api.steemit.com
{'qawazd', 'steemgems', 'curbot'}

Follower   Following   Tx ID                                     Block num  Timestamp
steemgems  utopian-io  25e9c3d8e625e634b68bd5e16e99327fd37174ae  26722368   2018-10-11T19:25:27
qawazd     utopian-io  8de43899a8ad84b8bd65a896e71e3e0eafda0757  26838941   2018-10-15T20:37:51

Both follow operations are valid. Their dates, 2018-10-11 and 2018-10-15, are close to those of the follows missing from @emrebeyler's account.
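To see how tightly the unexplained follows cluster, the five rows from the tables in this post can be summarized with a few lines of Python. The data below is copied from those tables; the function name summarize_window is just an illustrative choice.

```python
from datetime import datetime

# (follower, following, block number, timestamp) copied from the tables in this post.
UNEXPLAINED_FOLLOWS = [
    ("hariyati.amin", "emrebeyler", 26450469, "2018-10-02T08:41:33"),
    ("kenzyobiadi", "emrebeyler", 26492393, "2018-10-03T19:38:45"),
    ("erhanbute", "emrebeyler", 26710986, "2018-10-11T09:55:21"),
    ("steemgems", "utopian-io", 26722368, "2018-10-11T19:25:27"),
    ("qawazd", "utopian-io", 26838941, "2018-10-15T20:37:51"),
]


def summarize_window(rows):
    """Return the block range and time range covered by the given rows."""
    blocks = [block for _, _, block, _ in rows]
    times = [datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S") for _, _, _, ts in rows]
    return {
        "first_block": min(blocks),
        "last_block": max(blocks),
        "first_seen": min(times),
        "last_seen": max(times),
    }


window = summarize_window(UNEXPLAINED_FOLLOWS)
print(window["first_block"], window["last_block"])
print(window["first_seen"].isoformat(), window["last_seen"].isoformat())
```

A window like this could serve as a starting point when looking for what went wrong on the indexer side during that period.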

TL;DR

  • Some follow ops are missing on api.steemit.com's hive instance. (Generally clustered around 2018-10.)

  • Hive ignores a follow operation if it bundles multiple follows, although steemd accepts it. (This is the @curbot case.)

  • Hive ignores some legacy follow operations because they may include two elements in the what property. (Ex: ["posts", "blog"])

My GitHub Account

https://github.com/emre
