I am using Python 3 and tried the following:
data = pd.read_json('file.json',encoding="utf-8",orient='records',lines=True)
But it raises the following error:
ValueError: Expected object or value
For reference, this is the structure of the JSON file:
{ "_id" : ObjectId("5af1b1fd4f4733eacf11dba9"), "centralPath" : "XXX2", "viewStats" : [ { "totalViews" : NumberInt(3642), "totalSheets" : NumberInt(393), "totalSchedules" : NumberInt(427), "viewsOnSheet" : NumberInt(1949), "viewsOnSheetWithTemplate" : NumberInt(625), "schedulesOnSheet" : NumberInt(371), "unclippedViews" : NumberInt(876), "createdOn" : ISODate("2017-10-13T18:06:45.291+0000"), "_id" : ObjectId("59e100b535eeefcc27ee0802") }, { "totalViews" : NumberInt(3642), "totalSheets" : NumberInt(393), "totalSchedules" : NumberInt(427), "viewsOnSheet" : NumberInt(1949), "viewsOnSheetWithTemplate" : NumberInt(625), "schedulesOnSheet" : NumberInt(371), "unclippedViews" : NumberInt(876), "createdOn" : ISODate("2017-10-13T19:11:47.530+0000"), "_id" : ObjectId("59e10ff3eb0de5740c248df2") }]
}
With the following approach I can see the data, but I would like to be able to load it with read_json as attempted above:
with open('file.json', 'r') as viewsmc: data = viewsmc.readlines()
This produces the following output:
['{ \n', ' "_id" : ObjectId("5af1b1fd4f4733eacf11dba9"), \n', ' "centralPath" : "XXX2", \n', ' "viewStats" : [\n', ' {\n', ' "totalViews" : NumberInt(3642), \n', ' "totalSheets" : NumberInt(393), \n', ' "totalSchedules" : NumberInt(427), \n', ' "viewsOnSheet" : NumberInt(1949), \n', ' "viewsOnSheetWithTemplate" : NumberInt(625), \n', ' "schedulesOnSheet" : NumberInt(371), \n', ' "unclippedViews" : NumberInt(876), \n', ' "createdOn" : ISODate("2017-10-13T18:06:45.291+0000"), \n', ' "_id" : ObjectId("59e100b535eeefcc27ee0802")\n', ' }, \n',
I have tried read_json (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html) and load/loads(str), along with every other method and solution I could find, but none of them worked.
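Answer:
The problem is the format of the JSON file. The file is written in MongoDB shell syntax: wrappers such as ObjectId(...), NumberInt(...), and ISODate(...) are not valid JSON, which is why the parsers reject it. We tested the file with https://jsonformatter.curiousconcept.com/ and then fixed it with a regular expression. If you have a better suggestion, please let me know.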
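import re

with open("views3.json", "r") as read_file:
    data = read_file.read()

# Replace wrappers such as ObjectId("..."), NumberInt(3642), ISODate("...")
# with their inner value. The match is non-greedy so that several wrappers
# on the same line are each handled separately.
x = re.sub(r"\w+\((.+?)\)", r"\1", data)
print(x)

read_file.closed  # True: the with block has already closed the file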
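As a follow-up to the regex fix above, here is a minimal sketch of cleaning the text and then parsing it with the standard json module. The sample string, the helper name strip_shell_wrappers, and the explicit list of wrapper names (including NumberLong, which does not appear in the question) are assumptions made for illustration. Restricting the pattern to known wrapper names is a design choice to lower the risk of rewriting ordinary string values that happen to contain parentheses; it is not a full parser for the MongoDB shell format.

```python
import json
import re

# Hypothetical sample in the same MongoDB shell style as the file in the question.
raw = '''{
  "_id" : ObjectId("5af1b1fd4f4733eacf11dba9"),
  "centralPath" : "XXX2",
  "viewStats" : [
    { "totalViews" : NumberInt(3642),
      "createdOn" : ISODate("2017-10-13T18:06:45.291+0000"),
      "_id" : ObjectId("59e100b535eeefcc27ee0802") }
  ]
}'''

# Only strip wrappers we expect; the list of names is an assumption.
WRAPPER = re.compile(r'\b(?:ObjectId|NumberInt|NumberLong|ISODate)\((.*?)\)')


def strip_shell_wrappers(text):
    """Replace Name(value) with value so the text becomes plain JSON."""
    return WRAPPER.sub(r'\1', text)


doc = json.loads(strip_shell_wrappers(raw))
print(doc["viewStats"][0]["totalViews"])
```

Once the text parses, the nested viewStats list could be flattened into a DataFrame, for example with pd.json_normalize(doc, record_path="viewStats", meta=["centralPath"]), assuming a reasonably recent pandas. Note that ISODate values become plain strings and would still need converting to datetimes if required.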